2 PhyDL: Physical Design Specific Language



A. Ouared et al.

Fig. 3. Excerpt of PhyDL meta-model: core entities

Context class) of a given Manifest/Algorithm/cost-model is described by a set
of database system parameters. These parameters belong to different categories.
Meta-modeling the parameter categories and their attributes leads to
numerous classes and enumerations, so Fig. 4 gives only a brief view of the
meta-model classes that correspond to context parameters. Various
types of parameters are scattered across the literature. To organize them, we propose
four categories of parameters:

Fig. 4. Excerpt of PhyDL meta-model: focus on context parameters and their categories

A Meta-advisor Repository for Database Physical Design


– Database parameters: elements of this category relate to the database and to
the different functionalities that the database management
system (DBMS) has to provide. Through this category we specify the context
parameters of storage systems (for instance relational or non-relational), buffer management (e.g.
the buffer pool size) and database schema (e.g. tables/columns, partitioning


– Hardware parameters: generally, the hardware context parameters define
device characteristics, such as processing devices (e.g. CPU, Graphics Processing
Unit (GPU), etc.), storage devices (e.g. main memory, Solid State
Drive (SSD) or Hard Disk Drive (HDD)) and communication devices.

– Query parameters: the query parameters describe the concepts used to perform a set
of algebra operations (join, select, etc.). An operation can be a unary or a binary
function. The result of the query can be restricted by a set of predicates. These
operators should run as fast as possible by exploiting an underlying access
method (e.g., index methods or in-memory access methods [19]).

– Architecture parameters: elements of this category describe the deployment
architecture, such as distributed or parallel database systems, database clusters,
or cloud environments. The parameters of this category inform the cost model
context about the type of system architecture (e.g. shared memory, shared
disk, etc.) and its parameters, such as the number of nodes in a parallel environment.
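The four categories above can be sketched as a small data structure. This is only an illustration in Python, not the PhyDL meta-model itself: the class names `ParameterCategory`, `ContextParameter` and `Context`, and the sample parameter names and values, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ParameterCategory(Enum):
    """The four context-parameter categories proposed in the text."""
    DATABASE = "database"
    HARDWARE = "hardware"
    QUERY = "query"
    ARCHITECTURE = "architecture"


@dataclass(frozen=True)
class ContextParameter:
    # Field names are illustrative, not the PhyDL attribute names.
    category: ParameterCategory
    name: str
    value: object


@dataclass
class Context:
    name: str
    parameters: List[ContextParameter] = field(default_factory=list)

    def by_category(self, category: ParameterCategory) -> List[ContextParameter]:
        """Return the parameters that belong to one category."""
        return [p for p in self.parameters if p.category is category]


# Hypothetical context; parameter names and values are made up.
context = Context(
    name="Context 1",
    parameters=[
        ContextParameter(ParameterCategory.DATABASE, "bufferPoolSize", 4096),
        ContextParameter(ParameterCategory.DATABASE, "storageSystem", "relational"),
        ContextParameter(ParameterCategory.HARDWARE, "storageDevice", "SSD"),
        ContextParameter(ParameterCategory.QUERY, "operation", "join"),
        ContextParameter(ParameterCategory.ARCHITECTURE, "numberOfNodes", 4),
    ],
)

database_parameters = context.by_category(ParameterCategory.DATABASE)
```

Grouping by an explicit category field keeps the context a flat list, which is convenient when a cost model only needs a subset of the parameters.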

Figure 5 focuses on the Algorithm class and its related elements. We recall that
the meta-modeled classes relate only to the domain of physical database
design. Each algorithm is characterized by a name and by references that indicate
the scientific papers presenting the optimization algorithm. Every instance
of the Algorithm class also has at least one algorithm type: constraint programming,
deterministic, random or hybrid. Furthermore, each algorithm has
several parameters (AlgorithmParameter). Every parameter considered in the algorithm
falls into one of the following categories: VariationOperator, StoppingCriteria,
SolutionCodding, FitnessFunction, Initialization and UserDefinedParamter
(see Fig. 5). The optimization algorithms are coupled with the characteristics of
the system under design (i.e. the context). Since the context is also defined by a set
of parameters (as shown in Fig. 4), some of the algorithm parameters may be
matched to those of the context thanks to the ContextMappingRelation class
(see Fig. 5).
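The description of the Algorithm class can be illustrated with a short Python sketch. It is a simplified reading of the text, not the actual meta-model: the attribute names, the `unmapped_parameters` helper and the sample parameters of NSGA II are assumptions; only the algorithm types and parameter category names come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AlgorithmType(Enum):
    CONSTRAINT_PROGRAMMING = "constraint programming"
    DETERMINISTIC = "deterministic"
    RANDOM = "random"
    HYBRID = "hybrid"


class AlgorithmParameterKind(Enum):
    # Values spelled as the category names appear in the text.
    VARIATION_OPERATOR = "VariationOperator"
    STOPPING_CRITERIA = "StoppingCriteria"
    SOLUTION_CODING = "SolutionCodding"
    FITNESS_FUNCTION = "FitnessFunction"
    INITIALIZATION = "Initialization"
    USER_DEFINED = "UserDefinedParamter"


@dataclass
class AlgorithmParameter:
    name: str
    kind: AlgorithmParameterKind


@dataclass
class ContextMappingRelation:
    """Links one algorithm parameter to the context parameter it matches."""
    algorithm_parameter: str
    context_parameter: str


@dataclass
class Algorithm:
    name: str
    references: List[str]
    types: List[AlgorithmType]
    parameters: List[AlgorithmParameter] = field(default_factory=list)
    mappings: List[ContextMappingRelation] = field(default_factory=list)

    def __post_init__(self):
        # The text requires at least one algorithm type per instance.
        if not self.types:
            raise ValueError("an algorithm needs at least one type")

    def unmapped_parameters(self) -> List[str]:
        """Names of parameters that are not matched to the context."""
        mapped = {m.algorithm_parameter for m in self.mappings}
        return [p.name for p in self.parameters if p.name not in mapped]


# Hypothetical instance: parameter names and the mapping are made up.
nsga2 = Algorithm(
    name="NSGA II",
    references=["[22]"],
    types=[AlgorithmType.RANDOM],
    parameters=[
        AlgorithmParameter("populationSize", AlgorithmParameterKind.INITIALIZATION),
        AlgorithmParameter("maxGenerations", AlgorithmParameterKind.STOPPING_CRITERIA),
        AlgorithmParameter("storageBound", AlgorithmParameterKind.USER_DEFINED),
    ],
    mappings=[ContextMappingRelation("storageBound", "storageCapacity")],
)
```

Here `storageBound` is the only parameter tied to the context, so the other two are reported as unmapped.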

Figure 6 depicts the CostModel class. Every CostModel instance is composed
of a metric (an instance of the Metric class) and a cost function (an instance of
the CostFunction class). A cost model can have its own context, whose parameters are
a subset of the global context of the algorithm. The cost model is also characterized
by references that indicate the scientific papers where it is presented. Every
instance of the CostModel class also has at least one cost type. Thanks to the
CostFunction class, the mathematical formula of a cost model is supported in a
structured way. The CostFunction class consists of two kinds of operands: logical
costs and physical costs. An operand may be a given real value (i.e. an
instance of the ConstantValue class) or derived from other context parameters.



Fig. 5. Excerpt of PhyDL meta-model: focus on algorithm class and its relationships

In this case, the operand is represented by an instance of the CalculatedValue
class. This corresponds to what we highlighted previously, namely that
the main math formula assigned to a cost function may be
composed of other basic ones. We integrate the MathML [2] package into the
Meta-Advisor meta-model. Thanks to this, the math formulas of the CostFunction and
CalculatedValue classes can be expressed (as instances of the MathType class). Note
that, to keep the traceability and the origin of all math formula operands, the
Parameter, LogicalCost and PhysicalCost classes inherit from the MiType
class of the MathML package. We recall that the MiType class is used to define the
variables of equations.
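To make the operand structure concrete, the following Python sketch evaluates a cost function built from a ConstantValue and a CalculatedValue. It is a rough illustration under stated assumptions: the paper stores formulas as MathML, whereas here a formula is a plain Python callable and the total is assumed to be the sum of all operands. The parameter names `pagesRead` and `costPerPage`, the metric name and the numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class ConstantValue:
    """Operand given directly as a real value."""
    value: float

    def evaluate(self, context: Dict[str, float]) -> float:
        return self.value


@dataclass
class CalculatedValue:
    """Operand derived from context parameters through a formula."""
    formula: Callable[[Dict[str, float]], float]

    def evaluate(self, context: Dict[str, float]) -> float:
        return self.formula(context)


Operand = Union[ConstantValue, CalculatedValue]


@dataclass
class CostFunction:
    logical_costs: List[Operand] = field(default_factory=list)
    physical_costs: List[Operand] = field(default_factory=list)

    def evaluate(self, context: Dict[str, float]) -> float:
        # Assumption: the total cost is the plain sum of all operands.
        operands = self.logical_costs + self.physical_costs
        return sum(op.evaluate(context) for op in operands)


@dataclass
class CostModel:
    metric: str
    cost_function: CostFunction
    context: Dict[str, float]

    def estimate(self) -> float:
        return self.cost_function.evaluate(self.context)


# Hypothetical operands: an I/O cost derived from the context and a fixed cost.
io_cost = CalculatedValue(lambda ctx: ctx["pagesRead"] * ctx["costPerPage"])
fixed_cost = ConstantValue(50.0)

model = CostModel(
    metric="responseTime",
    cost_function=CostFunction(logical_costs=[fixed_cost], physical_costs=[io_cost]),
    context={"pagesRead": 1000, "costPerPage": 2.0},
)
estimate = model.estimate()
```

Because a CalculatedValue only reads the context dictionary, the same cost function can be re-evaluated under a different context without being rebuilt.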


Meta-advisor Repository

Figure 7 depicts the high-level workflow of how our repository can be used. Two
kinds of users (providers and seekers) interact with the repository via two
interface skeletons. Thus, our system produces different views based on user
roles. The skeletons of the user API (Application Programming Interface) have
been developed based on the PhyDL design language. In the following, we detail the
possible usage scenarios of the repository.

Enrichment Flow: A provider (e.g. an academic researcher) uses the providing
interface to insert their own optimization algorithms and cost models. Through
this interface (which plays the role of a design tool), the provider can model the
optimization algorithm, its cost models, its context and the addressed problem.
Once the provider obtains a model that conforms to the PhyDL meta-model,



Fig. 6. Excerpt of PhyDL meta-model: focus on Cost-Model class

the provider can upload it. A model conforms to a meta-model in the same way that a
program conforms to the grammar of the programming language in which it is written.
The uploaded model is serialized as an XML file (based on the XML Metadata
Interchange, XMI, schema [11]). We developed an enrichment process that
transforms the uploaded model into SQL statements. These statements are
mainly "INSERT" queries. The transformation is implemented in a
model-to-text language called Acceleo [20]. We thus store in the database
the information related to the algorithms contained in the XML file, together with the
location of the file. This process is repeated for each algorithm eligible
to be shared. Currently, trust in the repository is managed manually by a
moderator. Thus, the insertion process passes through a staging area.
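The enrichment flow can be approximated by the following Python sketch. It does not reproduce the Acceleo transformation: the XML snippet is a hypothetical, heavily simplified stand-in for an XMI file, the table layout is reduced to a few columns, and the staging area is modeled, as an assumption, by an `approved` flag that a moderator sets.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for an uploaded XMI file.
MODEL_XML = """
<model>
  <algorithm id="ALG_0010" name="NSGA II" type="Random"/>
  <context id="CXT_10" name="Context 1" algorithm="ALG_0010"/>
</model>
"""


def model_to_inserts(xml_text):
    """Turn the model into (sql, params) pairs, one per instance."""
    root = ET.fromstring(xml_text)
    statements = []
    for alg in root.findall("algorithm"):
        statements.append((
            "INSERT INTO Algorithm (AlgorithmID, name, type, approved) "
            "VALUES (?, ?, ?, 0)",  # new entries start unapproved (staged)
            (alg.get("id"), alg.get("name"), alg.get("type")),
        ))
    for ctx in root.findall("context"):
        statements.append((
            "INSERT INTO Context (id, name, Algorithm_id) VALUES (?, ?, ?)",
            (ctx.get("id"), ctx.get("name"), ctx.get("algorithm")),
        ))
    return statements


def enrich(connection, xml_text):
    """Run all generated INSERT statements in one transaction."""
    with connection:
        for sql, params in model_to_inserts(xml_text):
            connection.execute(sql, params)


def approve(connection, algorithm_id):
    """Moderator step: make a staged algorithm visible to seekers."""
    with connection:
        connection.execute(
            "UPDATE Algorithm SET approved = 1 WHERE AlgorithmID = ?",
            (algorithm_id,),
        )


connection = sqlite3.connect(":memory:")
connection.executescript("""
CREATE TABLE Algorithm (AlgorithmID TEXT PRIMARY KEY, name TEXT, type TEXT,
                        approved INTEGER);
CREATE TABLE Context (id TEXT PRIMARY KEY, name TEXT,
                      Algorithm_id TEXT REFERENCES Algorithm(AlgorithmID));
""")
enrich(connection, MODEL_XML)
staged = connection.execute("SELECT approved FROM Algorithm").fetchall()
approve(connection, "ALG_0010")
```

Using parameterized statements rather than string concatenation is a choice of this sketch; it avoids quoting problems when names or descriptions contain apostrophes.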

Selection Flow: This flow helps seekers reuse optimization algorithms
and cost models provided by other researchers. A seeker (e.g. a DBA) uses
the selection interface to search for appropriate optimization algorithms and
cost models. The seeker first uses the design tool to express the manifest.
Each expressed manifest is a model that conforms to the PhyDL meta-model,
and it is serialized as an XMI file. The selection process then
transforms the manifest model into SQL statements, which are
based on "SELECT" queries.



Fig. 7. Meta-advisor design repository

When the repository proposes a result that matches the manifest, the
seeker can download it. The result is an XMI file that was
previously uploaded by a provider. The accuracy of the request result depends on
the repository content and on the correctness of the request. Thus, the richer
the repository is, the better the result.
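The selection flow can be sketched in the same spirit. The Python example below turns a manifest, given here as a plain dictionary rather than an XMI model, into a parameterized SELECT query and runs it on an in-memory SQLite database. The function `manifest_to_select`, the column whitelist, the reduced schema and the second algorithm row (including its `ISP` problem type) are assumptions made for illustration.

```python
import sqlite3

# Assumed whitelist: only these columns may appear in a manifest.
ALLOWED_COLUMNS = {"Problem.type", "Problem.NFR"}


def manifest_to_select(manifest):
    """Build (sql, params) from a manifest {column: value or list of values}."""
    clauses, params = [], []
    for column, wanted in manifest.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported manifest attribute: {column}")
        if isinstance(wanted, (list, tuple)):
            # Several accepted values become an IN (...) predicate.
            placeholders = ", ".join("?" for _ in wanted)
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(wanted)
        else:
            clauses.append(f"{column} = ?")
            params.append(wanted)
    sql = (
        "SELECT DISTINCT Algorithm.id, Algorithm.name FROM Algorithm "
        "JOIN Problem ON Problem.Algorithm_id = Algorithm.id"
    )
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


connection = sqlite3.connect(":memory:")
connection.executescript("""
CREATE TABLE Algorithm (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Problem (Algorithm_id TEXT, type TEXT, NFR TEXT);
INSERT INTO Algorithm VALUES ('ALG_0010', 'NSGA II'),
                             ('ALG_0011', 'Other algorithm');
INSERT INTO Problem VALUES ('ALG_0010', 'VSP', 'energy'),
                           ('ALG_0011', 'ISP', 'responseTime');
""")

manifest = {"Problem.type": "VSP", "Problem.NFR": ["energy", "responseTime"]}
sql, params = manifest_to_select(manifest)
matches = connection.execute(sql, params).fetchall()
```

Only the first algorithm satisfies both predicates, so it is the single match. Checking column names against a whitelist matters because they are interpolated into the SQL text, unlike the values, which are bound as parameters.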


Case Study

This section provides a proof of concept of our contribution. The
goal is to show the implementation of our tools.

We have developed a design tool that allows users to create and visualize a physical
design problem conforming to the PhyDL design language. The design tool is based
on the Java EMF (Eclipse Modeling Framework) API and has been integrated as a
plugin in Eclipse (footnote 2), which is an integrated development environment.
Hereafter, we present two usage scenarios showing the enrichment and selection flows
of the Meta-Advisor repository. Both scenarios refer to the example presented in Sect. 2.2.


Storing in the Repository

The model of Fig. 8 is an instance of an optimization algorithm expressed in the
PhyDL language. As shown in Fig. 8, the model also contains
the context elements, the problem and the cost models.

Figure 9 shows how the XMI file of the model can be uploaded in order
to be stored in the repository. As presented in the contribution section
(see Sect. 3.3), the insertion process relies on a code generator component
that generates the SQL script. The script inserts the different
kinds of instances contained in the model (context, algorithms, problems, cost
models, etc.). Listing 1.1 shows an example of the generated script.


Footnote 2: Eclipse Modeling Project. www.eclipse.org/modeling/.



INSERT INTO Algorithm (AlgorithmID, name, type, selectionType, reference, description)
VALUES ('ALG_0010', 'NSGA II', 'Random', 'Multiple', 'A. Roukh, L. Bellatreche, ..', 'we provide a multi-objective ...');
INSERT INTO Context (id, name, Algorithm_id)
VALUES ('CXT_10', 'Context 1', 'ALG_0010');
INSERT INTO ArchitectureParameter (id, architecture, type, value, Context_id)
VALUES ('10', 'Centralized', 'SharedNothing', 'ALG_0010', 'CXT_10');


Listing 1.1. Excerpt of SQL queries to insert a new algorithm

SELECT Algorithm.*
FROM Algorithm, Context, Problem, CostModel, DatabaseParameter, HardwareParameter,
     ArchitectureParameter, QueryParameter
WHERE Problem.Algorithm_id = Algorithm.id
  AND CostModel.PhysicalDesign_id = Algorithm.id
  AND Problem.type = 'VSP'
  AND Problem.constraint IN ('storageCost', 'maintenanceCost')
  AND Problem.NFR IN ('energy', 'responseTime')
  AND DatabaseParameter.optmizationStruture = 'materializedView'
  AND DatabaseParameter.dataStorageType = 'rowOriented'
  AND DatabaseParameter.storageSystem = 'conventionalDatabaseSql'
  AND ArchitectureParameter.type = 'centralized'
  AND ArchitectureParameter.kind = 'sharedNothing'
  AND QueryParameter.type = 'OLAP';

Listing 1.2. Excerpt of SQL query to select algorithms

Fig. 8. Example of an optimization algorithm, as presented in [22], expressed in PhyDL


Fig. 9. Screenshot of the provider interface


Searching from the Repository

In this part of the case study, we illustrate how one can use the seeker interface
to search our repository. The left part of the tool depicted in Fig. 10 is an
example showing how a manifest can be specified. Each manifest represents a
specific search, and each search is based on the execution of a generated SQL
script. Listing 1.2 shows the SQL query corresponding to the characteristics of
the manifest.

When the search finds results matching the manifest characteristics, they are
presented on the right side of the tool (see the right
side of Fig. 10). The tool displays the relevant information
of each result; to examine and visualize a proposed
solution in depth, one can download it as an XMI file.

Fig. 10. Screenshot of seeker interface




Related Work

The literature contains numerous research works and design tools that tackle
physical database design problems. At the core of design tools are
optimization algorithms and cost models, which are used to estimate and compare
query performance. Industrial design tools have been proposed,
such as Index Tuning Wizard [1], Tuning Advisor in Microsoft SQL Server
[8], Teradata's Index Wizard [7], IBM DB2's Design Advisor [28] and Oracle's SQL
Tuning Advisor [9]. Other types of design problems include projection selection in
columnar databases, as in Vertica's DBD [26]. Design tools have the advantage
of automating and facilitating database optimization and maintenance. However,
each design tool suffers from several limitations. First, it depends on the
nature of the system being modeled. Second, its optimization algorithms
are specific to a particular problem and cannot be generalized. Third, the proposed
tools cannot offer all optimization algorithms and cost models. Last, they are
provided as black boxes (i.e., their internal implementations cannot be modified).

Open-source and academic tools have also been proposed, such as
SimulPh.D [3] and Parinda for Postgres [17], which assist DBAs in selecting
optimization structures such as horizontal partitioning and indexes; they use
advanced algorithms such as genetic algorithms, simulated annealing, etc. Like
industrial solutions, academic research works and tools have their limitations.
For instance, to reuse an existing optimization algorithm, whether to produce new
results or to reproduce the analysis results of research papers, one generally has
to extract it manually. This penalizes the reuse and the comparison of existing
algorithms. Based on this discussion, in this paper we propose the construction of a
physical design repository, inspired by database physical design and able to evolve
with the database and the technology.

Recently, the computational science community has spent a lot of effort building
repositories of data from their experiments and simulations for analysis, reuse and
reproduction purposes. The cTuning repository is an example of these initiatives. It is
an open-source, customizable Collective Knowledge Repository for the physics domain.
Similar efforts have been conducted by the process community; APROMORE
(Advanced Process Model Repository) is an example of these initiatives [16].
Sun et al. [25] propose a repository for APIs (Application Programming Interfaces)
to facilitate the development of new advanced applications. Hamid [13] proposes a
Model Repository Description Language (MRDL) for the development and use of model
repositories.



Conclusion

In this paper, we focused on database physical design by studying optimization
algorithms and analyzing their usage. This study was motivated by the
existence of a panoply of optimization algorithms and their important impact
on database physical design. To enhance the reuse of existing optimization
algorithms and help the database community cope with the various concepts of





this domain, we suggested a model-based approach called Meta-Advisor. First,
we suggested PhyDL, a Physical Design specific Language dedicated to database
physical design. The goal of the language is to offer the database community
a unified formalism to express optimization algorithms and cost models.
The language can also be used to express the manifests of users, especially DBAs.
Second, we proposed an open repository to capitalize on the knowledge
related to database physical design and thereby ease
different usages such as the visualization, evolution, sharing and reuse
of existing advisors. A proof of the feasibility and practicability of our proposal
was also presented.

Currently, our proposals are being tested by students following our Advanced
Databases course in order to get their feedback for possible improvements.
We are also working on a mechanism that makes our system trustworthy from the
provider side (to replace the moderator).


References

1. Agrawal, S., Chaudhuri, S., Narasayya, V.: Materialized view and index selection
tool for Microsoft SQL Server 2000. ACM SIGMOD Rec. 30(2), 608 (2001)

2. Asperti, A., Padovani, L., Coen, C.S., Guidi, F., Schena, I.: Mathematical knowledge management in HELM. Ann. Math. Artif. Intell. 38(1–3), 27–46 (2003)

3. Bellatreche, L., Boukhalfa, K., Alimazighi, Z.: SimulPh.D.: a physical design
simulator tool. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2009. LNCS,
vol. 5690, pp. 263–270. Springer, Heidelberg (2009)

4. Bellatreche, L., Bress, S., Kerkad, A., Boukorca, A., Salmi, C.: The generalized

physical design problem in data warehousing environment: towards a generic cost

model. In: MIPRO, pp. 1131–1137. IEEE (2013)

5. Bellatreche, L., Cheikh, S., et al.: How to exploit the device diversity and database

interaction to propose a generic cost model? In: IDEAS. ACM (2013)

6. Boukorca, A., Bellatreche, L., Cuzzocrea, A.: SLEMAS: an approach for selecting

materialized views under query scheduling constraints. In: COMAD, pp. 66–73.

Computer Society of India (2014)

7. Brown, D.P., Chaware, J., Koppuravuri, M.: Index selection in a database system,

3 March 2009. US Patent 7,499,907

8. Chaudhuri, S., Narasayya, V.: Self-tuning database systems: a decade of progress.

In: VLDB, pp. 3–14. VLDB Endowment (2007)

9. Dageville, B., Das, D., Dias, K., Yagoub, K., Zait, M., Ziauddin, M.: Automatic

SQL tuning in Oracle 10g. In: VLDB, pp. 1098–1109. VLDB Endowment (2004)

10. Giurgiu, I., Botezatu, M., Wiesmann, D.: Comprehensible models for reconfiguring

enterprise relational databases to avoid incidents. In: CIKM. ACM (2015)

11. O.M. Group: OMG MOF 2 XMI mapping specification. Version 2.4.1 (2011).

http://www.omg.org/spec/XMI/2.4.1/. Accessed 4 June 2016

12. Gupta, H., Mumick, I.S.: Selection of views to materialize under a maintenance
cost constraint. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540,
pp. 453–470. Springer, Heidelberg (1998)

13. Hamid, B.: A Model Repository Description Language - MRDL. In: Kapitsaki, G.,
Santana de Almeida, E. (eds.) ICSR 2016. LNCS, vol. 9679, pp. 350–367. Springer,
Heidelberg (2016). doi:10.1007/978-3-319-35122-3_23



14. Elghandour, I., Aboulnaga, A., Zilio, D.C., Zuzarte, C.: Recommending XML physical
designs for XML databases. VLDB J. 22(4), 447–470 (2013)

15. Kerkad, A., Bellatreche, L., Richard, P., Ordonez, C., Geniet, D.: A query beehive

algorithm for data warehouse buffer management and query scheduling. IJDWM

10(3), 34–58 (2014)

16. La Rosa, M., Reijers, H.A., et al.: Apromore: an advanced process model repository.

Expert Syst. Appl. 38(6), 7029–7040 (2011)

17. Maier, C., Dash, D., Alagiannis, I., Ailamaki, A., Heinis, T.: PARINDA: an interactive physical designer for PostgreSQL. In: EDBT, pp. 701–704. ACM (2010)

18. Mami, I., Bellahsene, Z.: A survey of view selection methods. ACM SIGMOD Rec.

41(1), 20–29 (2012)

19. Manegold, S., Boncz, P.A.: Optimizing database architecture for the new bottleneck: memory access. VLDB J. Int. J. Very Large Data Bases 9, 231–246 (2000)

20. Musset, J., Juliot, E., Lacrampe, S.: Acceleo référence. Technical report, Obeo et
Acceleo (2006)

21. Object Management Group. OMG Unified Modeling Language. Superstructure,

Version 2.4.1. http://www.omg.org/spec/UML/2.4.1/

22. Roukh, A., Bellatreche, L., Boukorca, A., Bouarar, S.: Eco-DMW: Eco-design

methodology for data warehouses. In: ACM DOLAP, pp. 1–10. ACM (2015)

23. Schmidt, D.C.: Model-driven engineering. Comput. IEEE Comput. Soc. 39(2), 25


24. Steinberg, D., Budinsky, F., et al.: EMF: Eclipse Modeling Framework. The Eclipse

Series (2008). Gamma, E., Nackman, L., Wiegand, J. (eds.)

25. Sun, Y.-J.J., Barukh, M.C., Benatallah, B., Beheshti, S.-M.-R.: Scalable SaaS-based
process customization with casewalls. In: Barros, A., Grigori, D., Narendra,
N.C., Dam, H.K. (eds.) ICSOC 2015. LNCS, vol. 9435, pp. 218–233. Springer,
Heidelberg (2015). doi:10.1007/978-3-662-48616-0_14

26. Varadarajan, R., Bharathan, V., et al.: DBdesigner: a customizable physical design
tool for Vertica analytic database. In: ICDE, pp. 1084–1095. IEEE (2014)

27. Zhang, N., Tatemura, J., Patel, J.M., Hacıgümüş, H.: Towards cost-effective storage
provisioning for DBMSs. VLDB 5(4), 274–285 (2011)

28. Zilio, D.C., Zuzarte, C., et al.: Recommending materialized views and indexes with

the IBM DB2 design advisor. In: ICAC, pp. 180–187. IEEE (2004)

Linked Service Selection Using the Skyline


Mahdi Bennara1(B), Michael Mrissa2, and Youssef Amghar1

1 Université de Lyon, LIRIS, INSA-Lyon - CNRS UMR5205, 69621 Lyon, France
2 Université de Lyon, LIRIS, Université Lyon 1 - CNRS UMR5205, 69622 Lyon, France


Abstract. Recently, resource oriented computing has changed the way
Web applications are designed. Because of the increasing number of APIs,
centralized repositories are no longer a viable option for discovery. As a
consequence, a decentralized approach is needed in order to enable value-added
applications. In this paper, we propose a client-side QoS-based
selection algorithm that can be executed along the discovery stage. Our
solution provides different alternatives based on the skyline approach to
select resources and maintain acceptable time performance.

Keywords: RESTful linked web services · Quality of service








In the past twenty years, SOAP-based Web services have helped reach
syntactic-level interoperability for distributed applications on the Web. More recently,
resource-oriented computing, and in particular the REST architectural style [5],
has revised the way we interact with services, bringing new advantages such
as a uniform interface (and consequently a generic client), HATEOAS
(hypertext-driven applications; see footnote 1), cacheability, etc. In parallel, the
Web of services has started
an evolution towards semantic-level interoperability, with a lot of work around
semantically described Web services [6,7] to allow services to exchange
semantically annotated data. This evolution has moved towards linked data. Combining
the RESTful architectural style with semantic annotations has unlocked the
benefits of using linked data for Web applications. We call Linked Web Services
those RESTful services that are described with linked data and that exchange
linked data. Despite the evolution of service technologies, the need for service
composition to build complex applications is still present. However, the
challenges have changed. As centralized solutions for service discovery have proven
not to scale well [2], the need for distributed service discovery has emerged [9].


Footnote 1: Hypermedia As The Engine Of the Application State.

© Springer International Publishing Switzerland 2016
L. Bellatreche et al. (Eds.): MEDI 2016, LNCS 9893, pp. 88–97, 2016.
DOI: 10.1007/978-3-319-45547-1_7
