International Journal of Performability Engineering, 2019, 15(2): 559-570 doi: 10.23940/ijpe.19.02.p20.559570

Eliciting Data Relations of IOT based on Creative Computing

Lin Zou a,b, Qinyun Liu a, Sicong Ma a, and Fengbao Ma c

a Centre for Creative Computing, Bath Spa University, Bath, BA2 8BN, England, UK

b Collaborative Innovative Centre of e-Tourism, Beijing Union University, Beijing, 100101, China

c Beijing Institute of Fashion Technology, Beijing, 100029, China

Corresponding author: E-mail address: sicong.ma13@bathspa.ac.uk

Received: 2018-11-16   Online: 2019-02-25

Abstract

The Internet of things aims to create valuable results by responding to changing environments intelligently and creatively. Its expected properties include being autonomous, cooperative, situational, evolvable, emergent, and trustworthy, properties that are also required by business operators. This research develops an approach to extract data relations from a multitude of business information through the design of a Game Theory Data Relations Generator (GTDRG) framework, which extracts data relations by deducing the relationships among factors in game theory models. GTDRG builds on the combination of creative computing and the Internet of things to generate outputs including, but not limited to, relation graphs and game theory model mappings. Once the proposed framework is established, it is crucial to interpret business information according to user requirements and to make Internet of things components react to a changing environment. In summary, with the help of the designed model, software can satisfy users' needs and requirements: not only can existing data relations be extracted, but the software can also be built to make predictions based on data relations; more importantly, it can save costs for organisations while improving the effectiveness and efficiency of businesses in a creative way.

Keywords: Internet of things; creative computing; game theory; Semantic Web; data relation; big data



1. Introduction

Creative computing aims at guiding humans to create innovative, surprising, and useful systems that achieve effective service functions in distinct fields. Creative computing can be used to combine knowledge from different disciplines, which leads to the evolution of the boundaries among distinct fields. This combined use of knowledge is able to solve new problems, because knowledge is the basis for generating creativity. The business environment shares remarkably similar features with the Internet of things properties, and what businesses demand most from the software paradigm is the ability to make predictions in order to meet dynamic economic changes. According to Mei's study, Internet of things software entities can sense dynamic changes by continuously monitoring environment variables, and the structure and behaviour of the software can then adapt accordingly. Data relations among business entities alter continually because the business entities themselves change continuously; therefore, the extraction of dynamic data relations can benefit from the assistance of creative computing. The Internet of things operates on behalf of business actors in changing business environments. Reconfiguring, troubleshooting, and general maintenance tasks lead to costly and time-consuming procedures of organisational transformation. Therefore, it is highly desirable to automate data relation extraction and achieve the desired quality requirements within a reasonable cost and time range.

In the proposed approach, knowledge from various domains including business, psychology, and computing joins forces to create the Game Theory Data Relations Generator (GTDRG) by applying the creative computing methodology. The data relations are generated from different data sources of heterogeneous structure and quality. All the data needs to be turned into valuable information, which can assist actors in making strategic decisions. Under conventional management methods, structured data can be collected from an organisation's systems in scenic areas and non-structured data is received from cameras and other electronic devices in those regions, but staff are often unsure how to apply these figures to solve problems. Big data methods should be applied to data as it flows into the organisation so that it becomes valuable. If this dynamic volume of data can be used efficiently and creatively to improve working efficiency and quality, the latent value of the data may be developed in specific areas. In this way, tough problems can be solved by humans assisted by big data methods, which means data relations are urgently required.

This paper aims to extract data relations of the Internet of things by applying creative computing to business-related big data through the combined use of cross-disciplinary knowledge. The aim is to increase the efficiency, effectiveness, and creativity of Internet of things software by applying Semantic Web and big data technology to an ideal game theory model. As the final result, a creative Internet of things software framework is built to achieve the functionality of data relation extraction. Related work on creative computing, Semantic Web, data mining, and natural language processing is reviewed first; then each sub-system of the Game Theory Data Relations Generator (GTDRG) framework is presented in detail, and the core algorithm of the semantic labeller is explained step by step.

2. Background

2.1. Creative Computing

Computer systems are combined with human creativity to obtain creative algorithms that allow problems to be solved by creative computers. This has a profound impact on the creativity domain, since human creativity is ambiguous and uncontrollable [1]. A computer system, however, is stable and has some unambiguous constraints. An accurate "number" as a "fact" can be output by a computer system.

Although humans have regarded computers as imperative creative machines, software development also incurs numerous constraints and is shaped by compromises. Thus, it cannot be denied that creativity usually comes from breaking constraints. Unsophisticated software can be recognised through successful creativity and can then be renovated. The renovation of software may be challenged by the need to overcome vagaries, messiness, unrelatedness, and contradictions in order to serve people more smartly.

Conventionally, new challenges and problems are addressed by machines. Computing therefore abstracts and compromises across several levels, from user requirements down to a specific implementation path, and becomes more sophisticated in this way. Currently, people are required to adapt to creative computing applications. The ambiguity of human creative ideas can in turn be influenced by effective computing.

The requirements for creative computing are explicit. Modern, sophisticated problems, such as business model deployment, are difficult for people to solve without assistance. Users can establish business models with guidance from creative computing, which combines knowledge from business and other fields. Natural science, the arts, and the humanities may face dramatic and complicated questions and constraints, and there are similar situations in the business field. Sophisticated problems usually persist because the knowledge of a single domain is insufficient to address them. Even computer systems have intrinsic constraints that limit their capability to tackle complex problems creatively. Thus, creative computing is proposed to provide more satisfying creative approaches for people in these fields.

In order to achieve creativity, both human creative ideas and software development are imperative. Creativity requires collaboration between creative thought and professional software engineering knowledge. The process of establishing creativity can be regarded as a reformulation that finds an approach to solve otherwise unrelated problems. As the process of creating business models has been developed for years, there is a requirement for corporations to utilise creative computing to achieve knowledge combination and establish creative business models or creative business model development methods.

2.2. Data Relations

Beyond a literal understanding of linked data, it can also be regarded as a method, practice, or even technology for publishing data on the Web [4], which results in a Web of data. In order to elaborate on it, another concept, the "Web of documents", should be discussed first. Nowadays, to organise countless data, information, knowledge, and even wisdom, the main method of representing various resources on the Web is through Web pages, which can also be regarded as documents. There are a variety of supporting technologies, such as HTML, CSS, and so on. The documents are interconnected by hyperlinks that can lead users from one page to another (e.g., Wikipedia) [5]. This kind of Web is known as the Web of documents. The shortcoming of the current Web is that different applications cannot share or integrate materials. Due to the high complexity of various tasks, the user experience is required to be more and more precise and comprehensive. The smaller the processing unit is, the better the result can be. That is why we need linked data, as it uses a single piece of data as the unit. Foreseeing its enormous potential, Tim Berners-Lee, the inventor of the World Wide Web (WWW) and the main originator of the Semantic Web, coined the term in a note in 2006. The aim of linked data is to evolve the Web into a global data space [6].

One of the most typical examples of linked data is the Linking Open Data Project [7]. It was started in February 2007 by Chris Bizer and Richard Cyganiak and is supported by the W3C Semantic Web Education and Outreach (SWEO) Interest Group [6]. The ultimate aim of the project is to convert and link existing open datasets. The term "open data" was proposed along with a movement known as "open government" [8], which aims to establish modern collaboration among various fields. Nowadays, it is all about opening up information and data, as well as making it possible to use and re-use them. That is why we need to turn from open data to linked open data. The path was described by Tim Berners-Lee when he first presented his 5-star model at the Gov 2.0 Expo in Washington DC in 2010 [9]. Inspired by it, more projects (e.g., Europeana Linked Open Data) are continuously emerging.

2.3. Semantic Web and Resource Description Framework

Linked data cannot be discussed without talking about the Semantic Web. There are many different opinions on the connections between linked data and the Semantic Web. One widely accepted opinion is that linked data is the most basic foundation of the Semantic Web. The concept of the "Semantic Web" is interchangeable with the Web of data [10]. In W3C's vision, the Semantic Web is the Web of linked data [11]. Nowadays, as mentioned before, machines are required to take on an increasing share of work for people, even thinking work. In order to better assist people, computing is becoming more and more intelligent. This is why machine-readable data is in demand. It is the "relations" among various linked data that allow "semantics" to be added to the Web.

Linked data is empowered by various technologies. It is worth discussing one of the essential ones, RDF, the abbreviation of Resource Description Framework [12]. It is a kind of meta-data (i.e., data about data). RDF data is composed of triples, and each triple has the format "subject-predicate-object". Accordingly, in order to create linked data, there are four principles [13] defined by Tim Berners-Lee to comply with:

·Use URIs to name or identify things.

·Use HTTP URIs so that these things can be looked up.

·Provide useful information about what a name identifies when it is looked up, using open standards such as RDF, SPARQL, etc.

·Refer to other things using their HTTP URI-based names when publishing data on the Web.

Therefore, each part of a triple is a URI (Uniform Resource Identifier) that represents an online resource. Based on this kind of mechanism, the “relation” or interrelationship can be constructed among the countless number of resources on the Web to enable a Web of data (i.e., Semantic Web).
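
To make the triple mechanism concrete, the following is a minimal sketch in Python using the rdflib library; the namespace, resource names, and properties (for example, assignedTo and difficultyLevel) are hypothetical and only illustrate the subject-predicate-object pattern described above.

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/business#")   # hypothetical namespace

g = Graph()
# Each triple follows the subject-predicate-object pattern described above.
g.add((EX.task42, EX.assignedTo, EX.worker7))
g.add((EX.task42, EX.difficultyLevel, Literal(2)))

for subject, predicate, obj in g:                # one statement per triple
    print(subject, predicate, obj)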

There are many serialisation formats of RDF data, such as RDF/XML. It is the first standard format for serializing RDF data and is based on XML (Extensible Markup Language) syntax. Although the RDF/XML format is still in use, many RDF users now prefer other RDF serializations like N-Triples, both because they are more human-friendly, and because some RDF graphs are not representable in RDF/XML due to restrictions on the syntax of XML. The specific query language of RDF is called SPARQL. SPARQL to RDF data is like SQL to relational data. It can be seen that the most important feature of this kind of data is the relationships between different online resources. Depending on the relationships in RDF data, the Semantic Web could allow computing machines to reason over it [14].
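
As a sketch of how such data can be queried, the hypothetical example below runs a small SPARQL query over an rdflib graph; the prefix, property, and threshold are illustrative only and are not taken from the paper.

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/business#")   # hypothetical namespace
g = Graph()
g.add((EX.task42, EX.difficultyLevel, Literal(2)))
g.add((EX.task43, EX.difficultyLevel, Literal(5)))

# SPARQL plays the role for RDF data that SQL plays for relational data:
# select every task whose difficulty level is below 3.
query = """
PREFIX ex: <http://example.org/business#>
SELECT ?task ?level
WHERE {
    ?task ex:difficultyLevel ?level .
    FILTER (?level < 3)
}
"""
for task, level in g.query(query):
    print(task, level)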

DBpedia is a popular example of Semantic Web or Web of linked data based on RDF [15-16]. It is an RDF version of Wikipedia. It extracts structured data from Wikipedia and presents it as linked data. By transforming Wikipedia into a structured knowledge base, DBpedia provides a perfect platform for the development of various Semantic Web applications.

2.4. Text Mining

Text mining, which is equivalent to text data mining or text analysis, aims at extracting essential information from large volumes of text data [17]. In the real world, large volumes of information are usually recorded in text databases consisting of all kinds of text documents, and typical tasks include information retrieval, lexical analysis to analyse word frequency distributions, pattern recognition, tagging, etc. [18] With the rapid development of electronic text information, text mining has become a popular research topic in the information research domain.

The recorded data in a text database may be highly structured (such as websites on the World Wide Web), semi-structured (like emails and some XML webpages), or fully structured. A representative example of structured data in text databases is the documents in a library database [17]. The documents in a library contain both structured data (title, author, publishing date, length, classification, etc.) and non-structured data (abstract and content) [18]. Generally, excellent databases can be established by applying relational database systems. For non-structured data, however, special processing methods must be employed to transform it.

Text mining means obtaining information that users are interested in, or useful data, from non-structured text information [17]. It is an approach to obtain previously unknown, understandable, and consequently useful knowledge. At the same time, it makes it easier to organise information as references for future research.

The original purpose of text mining is to extract unknown knowledge from unprocessed data. This is difficult because the data to be addressed is ambiguous and non-structured [18]. Therefore, the process of text mining is complicated. It also combines many subjects, such as information technology, text analysis, pattern recognition, statistics, data visualisation, database techniques, and machine learning and data mining techniques. Text mining developed on the basis of data mining, and thus the processing procedure and theory of text mining are similar to those of data mining.
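
As a minimal illustration of the lexical analysis mentioned above, the following Python sketch counts term frequencies in a small, hypothetical collection of unstructured texts using scikit-learn; it only hints at the first stage of a text mining workflow and is not taken from the paper.

from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical unstructured texts from a scenic-area organisation.
documents = [
    "Visitor numbers in the scenic area rose sharply over the holiday.",
    "Camera data from the scenic area shows crowding near the main gate.",
]

vectorizer = CountVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(documents)      # documents x terms count matrix

# Total frequency of each term across the collection, highest first.
totals = matrix.sum(axis=0).A1
for term, count in sorted(zip(vectorizer.get_feature_names_out(), totals),
                          key=lambda pair: -pair[1]):
    print(term, count)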

2.5. Natural Language Processing Model

Natural language processing (NLP) is an essential speciality in the fields of computer science and artificial intelligence. It contains theories and approaches that can be applied to achieve effective communication between humans and machines [19]. Natural language processing is an interdisciplinary domain combining linguistics, computer science, and mathematics, and therefore has a close relationship with language research. However, it focuses not only on language but also on interactive language systems between humans and machines, and such software systems are regarded as part of computer science. NLP is thus a domain that brings together computer science, artificial intelligence, and linguistics, focusing on the interactions between computer language and human language.

As we all understand, language is an important characteristic that distinguishes humans from other animals. Language is a basic function for logical thinking, and most human knowledge and techniques are recorded in language for the next generations to learn [20]. Therefore, the key element in achieving artificial intelligence is to establish a language processing system for machines in order to communicate more effectively [19]. However, it is quite difficult to establish a language system for a machine without consciousness. The existing problems can be summarised in two aspects. The first is that the grammars of existing natural language processing systems are limited to analysing single sentences rather than the connections between contexts and the influences and constraints that the communication environment places on a specific sentence; thus, ambiguous words, omitted words, and the effects of the communication environment cannot currently be analysed [21]. The second is that humans can retrieve relevant knowledge and information from their brains when they need to comprehend a single sentence, whereas machines cannot store and recall comparable knowledge in the way human brains do, even though computers have expanded their storage to a high level [21].

The basic theories of natural language processing are automata, formal logic, and machine learning methods. The primary language resources are dictionaries and language databases. It is necessary for machines to handle the communications and language used between humans in both ordinary life and professional conditions [21]. The main techniques of natural language processing are related to machine learning and data mining algorithms of different sorts, such as extraction, pre-processing, classification, statistics, and clustering [20]. Additionally, techniques for grammatical analysis of both phrases and sentences, text generation, and voice recognition are related to NLP as well. The applications of natural language processing are quite wide, including text classification and clustering, information search and filtering, machine translation, information extraction, automatic question answering, and new information evaluation systems.

Even though natural language processing techniques and theories have developed rapidly, there are still limitations in this domain. In order to solve complex problems in natural language processing, new and more advanced processing techniques and theories should be developed. [22] The essential role of natural language processing will affect the improvement of artificial intelligence and computer science. It will boost communication efficiency between humans and machines.

3. Game Theory Model Representation

As mentioned before, it is difficult to extract data relations from business information directly; alternatively, the proposed approach demonstrates them by presenting the relationships among factors in various game theory models. If the relationships between the different factors are deduced, it is possible to extract the data relations by mapping game theory factors to players, strategies, payoffs, and orders.
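
As a minimal sketch of this mapping, not the authors' implementation, the four factors can be held in a small data structure from which relations are read off directly; all names and values below are illustrative.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class GameModel:
    """Container for the four game theory factors named above."""
    players: List[str]
    strategies: Dict[str, List[str]]                   # player -> available strategies
    payoffs: Dict[Tuple[str, ...], Tuple[float, ...]]  # strategy profile -> payoff per player
    order: List[str] = field(default_factory=list)     # empty list = simultaneous moves

def relations_from(game: GameModel) -> List[Tuple[str, str, str]]:
    """Read data relations off the mapped factors as subject-predicate-object triples."""
    triples = []
    for player, options in game.strategies.items():
        for option in options:
            triples.append((player, "canChoose", option))
    for profile, values in game.payoffs.items():
        for player, value in zip(game.players, values):
            triples.append((player, "payoffUnder " + "/".join(profile), str(value)))
    return triples

game = GameModel(players=["requester", "worker"],
                 strategies={"requester": ["hire", "reject"], "worker": ["work", "shirk"]},
                 payoffs={("hire", "work"): (3.0, 2.0)})
print(relations_from(game))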

Consequently, the Game Theory Data Relations Generator (GTDRG) framework is presented in Figure 1 to demonstrate the key features and functions that contribute to data labelling through the semantic labeller, knowledge discovery by data mining, and mapping of game theory by relation mapping. The user interface contains both input and output of GTDRG, and input has three types of data sources: persistent data, cloud data, and dynamic data. The final output of GTDRG will be the relation graph or complete game theory cases with factors located in their positions.

Figure 1. Game theory data relations generator


The relations between Internet of things services are cooperative. Accordingly, one of the most important properties of the Internet of things is being cooperative. Nowadays, due to increasingly complicated societal requirements, it is no longer enough for a single software entity to accomplish a task alone. With the convenience supplied by the Internet, various applications distributed online can now be associated, combined, or united to address complex difficulties. Therefore, the Internet of things should be able to provide a cooperating platform or mechanism for the game theory model to act on the communication data between different applications by eliciting data relations.

Furthermore, no matter what the various Internet of things services provide to or require from each other, the ultimate aim of their cooperation is to assemble and organise distributed software entities to meet specific user demands dynamically. Therefore, the focus of the cooperation is to establish communication between different software entities. In order to do so, it is vital to compose appropriate interaction rules or principles to restrain the operations of the software entities. This is where the creative rules of the game theory model should be embedded. Based on the technologies of the Semantic Web, it is possible to embed creative rules formed from surprising knowledge of other fields by manipulating the data relations between different applications.

Internet of things services are not replaceable. Firstly, the Internet has become the most fundamental technology infrastructure, and there are no longer distinct differences between hardware, software, and the Web. Secondly, as the modern Internet environment has become increasingly open, dynamic, and uncontrolled, it is necessary to consider new technology to adapt to this change. Thirdly and most importantly, as discussed before, it is time for various computing systems to "unite and conquer" to serve the world better. The six crucial properties of the Internet of things are able to ensure a more intelligent platform for GTDRG to perform on.

The interactions between Internet of things are both simultaneous and sequential. According to the descriptions of the Internet of things properties, “autonomous” means software entities will participate in the cooperation on-demand with each other. Therefore, it is a kind of demand-driven operation. In other words, if the situation requires various software entities to act at the same time, the interactions would be simultaneous. Otherwise, if a software entity’s operation depends on other operation results, the interactions might be sequential.

It is not necessary to discuss whether the information about the environment is perfect or not for the Internet of things. The two vital properties "autonomous" and "situational" ensure that the Internet of things has the necessary and comprehensive information to adapt to changes in the environment. According to the previous discussion, the operations of the Internet of things are demand-driven, so all necessary information about the changing environment would be collected and analysed. The "situational" property requires both software entities and operating platforms to be able to expose their runtime states and behaviours. Therefore, a comprehensive point of view on the open environment is provided for the Internet of things.

There may be some overlap between the Internet of things and artificial intelligence (i.e., AI). However, the aim of the Internet of things is not to imitate human intelligence, but to adapt to the open, dynamic, and uncontrolled Internet environment. The interactions between the various software entities are based on composed rules or models. It may be possible that more intelligent rules or principles could be embedded based on the mechanisms of the Semantic Web, which is one of the most essential AI technologies.

4. Mapping Processes and Rules

4.1. Types of Data Sources

There are three types of data sources that need to be treated respectively, and they need to be processed following SSL rules to protect consumer privacy. The data sources include persistent data, cloud data, and dynamic data, as illustrated by the sketch after the list below.

·Persistent Data: Persistent data mainly indicates structured data such as documents, files, or data in a Relational Database Management System (RDBMS).

·Cloud Data: Cloud data mainly indicates semi-structured data such as Web-based information in Hypertext Markup Language (HTML). Semi-structured data is a form of structured data that does not conform to the formal structure of the data models associated with relational databases or other sorts of data tables, for example XML (Extensible Markup Language). Semi-structured data exhibits an enormous variation of structures [23]. In the information era, an ocean of such data can be found across the Web.

·Dynamic Data: The main difference between big data and traditional databases is the ability to process dynamic data. In order to achieve the autonomous, cooperative, situational, and evolvable properties, it is indispensable to process dynamic data in complex business situations, because the data relations keep changing in running business environments.
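
The following is a minimal sketch of how such a routing step might look in Python; the type names and handler descriptions are hypothetical and simply distinguish the three source types listed above.

from typing import Any, Dict

def ingest(source: Dict[str, Any]) -> str:
    """Route a data source to a handler according to the three types above."""
    kind = source.get("type")
    if kind == "persistent":        # RDBMS tables, documents, files
        return "load via SQL or file readers"
    if kind == "cloud":             # semi-structured HTML/XML gathered from the Web
        return "parse the markup and extract fields"
    if kind == "dynamic":           # streams whose relations change at run time
        return "process incrementally and re-label on change"
    raise ValueError("unknown data source type: " + repr(kind))

print(ingest({"type": "cloud", "uri": "http://example.org/page.html"}))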

Presenting the comprehensive value of big data requires great synergy between multiple techniques. The file system of the data sources provides foundational storage capability, and data storage should be organised for knowledge management. A real-world analogy is the motorway, whose role for big data is played by cloud computing. Because large volumes of data exist, cloud computing contributes to data storage, management, and analysis.

4.2. Semantic Labeller Algorithms

For cleansing and extracting relations among various data, it is indispensable to discover semantic connections between entities. In text, pairs of entities are usually examined in a document to determine whether connections exist between them. Conventional approaches such as pattern matching, kernel methods, logistic regression, and augmented parsing are evaluated and applied to label the data. In order to demonstrate the semantic labeller appropriately, a brief semantic labeller algorithm is illustrated.

The whole algorithm, shown in Figure 2, is recursive: each unit of data will be labelled multiple times to describe the facts accurately.

Figure 2. Semantic labeller algorithm


There are five steps to label databases in the Semantic Labeller Algorithm (a minimal sketch follows the list):

·Input data from the database by using the Relational Database Management System (RDBMS).

·For the data selection parameter, in order to extract exact labels from data, it is crucial to filter data by applying different parameter settings. There will be several pre-set parameter settings to cover time integrity and spatial integrity. The data will be filtered continuously until all parameter settings have been processed once.

·The algorithm requires RDF data to extract labels in the next step; therefore, it is indispensable to convert source data into RDF data first, and data needs to be tabulated and cleaned if necessary.

·After RDF data conversion, the following step is to extract labels from the RDF data and associate the labels with the source data.

·The labelled results will be used to update the labelled database dynamically.
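
The following is a minimal sketch of these five steps, assuming hypothetical helper functions and a toy in-memory "database"; it is not the authors' implementation, only an illustration of the control flow.

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/labels#")                  # hypothetical namespace

PARAMETER_SETTINGS = ["time_integrity", "spatial_integrity"]  # pre-set filters (step 2)

def fetch_rows(connection):
    """Step 1: read source rows from the RDBMS (stubbed with a list of dicts)."""
    return connection

def passes(row, setting):
    """Step 2: keep only rows that satisfy the current parameter setting (stub)."""
    return setting in row.get("satisfies", [])

def to_rdf(rows):
    """Step 3: convert tabulated rows into RDF triples."""
    g = Graph()
    for i, row in enumerate(rows):
        subject = EX["record" + str(i)]
        for key, value in row.items():
            if key != "satisfies":
                g.add((subject, EX[key], Literal(value)))
    return g

def extract_labels(graph):
    """Step 4: derive labels from the RDF graph (here simply the predicate names)."""
    return sorted({str(p).split("#")[-1] for _, p, _ in graph})

def label_database(connection):
    """Step 5: update the labelled database, one pass per parameter setting."""
    labelled = {}
    for setting in PARAMETER_SETTINGS:
        rows = [r for r in fetch_rows(connection) if passes(r, setting)]
        labelled[setting] = extract_labels(to_rdf(rows))
    return labelled

rows = [{"region": "north", "visits": 120, "satisfies": ["time_integrity"]}]
print(label_database(rows))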

4.3. Relation Discovery Techniques

The proposed research regards text mining as a normalised procedure within knowledge discovery in databases (KDD) [18]. Some scholars define text mining as one of the steps of KDD. The main processing steps of text mining are listed below, followed by a minimal sketch:

·Text pre-processing: Pick up the texts that are relevant to the mission and transform the target texts into a format that can be recognised and processed by text mining tools.

·Text mining: After pre-processing, algorithms from machine learning, data mining, and pattern recognition are used to extract specific knowledge and models relevant to the particular implementation targets.

·Model assessment: This is the last step, in which well-defined assessment specifications are used to evaluate the obtained knowledge and models. If the results of the assessment match the specifications, the model is recorded in preparation for future implementation. If, on the other hand, no matching specification can be found, the system readjusts and revises the processing [24].
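
The sketch below illustrates these three steps, assuming TF-IDF pre-processing, k-means clustering as the mining step, and a silhouette score as one possible assessment specification; the corpus and acceptance threshold are hypothetical.

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

corpus = [
    "ticket sales rose in the north gate area",
    "ticket revenue increased near the north entrance",
    "camera maintenance scheduled for the west gate",
    "west gate cameras require maintenance this week",
]

# 1. Text pre-processing: turn raw text into a machine-processable representation.
features = TfidfVectorizer(stop_words="english").fit_transform(corpus)

# 2. Text mining: extract a model (here, topic-like clusters) from the features.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)

# 3. Model assessment: accept the model only if it meets the specification.
score = silhouette_score(features, model.labels_)
print("silhouette:", round(score, 3),
      "-> record model" if score > 0.1 else "-> readjust and revise")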

The main implementation aspects of knowledge discovery can be summarised in four parts:

·Content-based search engines.

·Automatic classification, abstraction, and filtering of information.

·Automatic information extraction based on the semantic labels from the previous step.

·Machine translation and automatic question answering, which require further natural language processing techniques [19].

A mapping method is applied in GTDRG for functional prediction. After knowledge discovery in the previous process, the relationships of the labels should be mapped into the game theory model to demonstrate the relations between labels. Once the data is allocated into the game theory model, the relations can be extracted directly from the relations between the factors. As a result, mapped game theory cases are presented to users at the user interface.
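
One possible, simplified form of this allocation step is sketched below; the label categories and mapping rules are hypothetical and only show how labelled data could be slotted into the four factor roles.

# Hypothetical rules mapping label categories onto game theory factor slots.
FACTOR_RULES = {
    "actor": "player",
    "action": "strategy",
    "utility": "payoff",
    "sequence": "order",
}

def map_labels(labelled_items):
    """Allocate labelled data items (name, category) into game theory factor slots."""
    model = {"player": [], "strategy": [], "payoff": [], "order": []}
    for name, category in labelled_items:
        factor = FACTOR_RULES.get(category)
        if factor is not None:
            model[factor].append(name)
    return model

labels = [("requester", "actor"), ("worker", "actor"),
          ("hire", "action"), ("reject", "action"),
          ("task_utility", "utility")]
print(map_labels(labels))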

Alternatively, the labelled data could be applied to value stream mapping, which can analyse the current situation and design the future state.

5. Case Study of Crowdsourcing Mechanism Design

There are many games and models in game theory, and it is easy to comprehend the relationship between game theory and factor relations through the Nash equilibrium. A Nash equilibrium is an action profile, in a game consisting of players, an action set for each player, and a payoff function for each player, from which no individual player can gain a higher payoff by deviating unilaterally [15]. This concept is widely used for predicting the outcome of strategic interactions in the social sciences. In this paper, we take two example cases of Internet of things applications and use them to illustrate and validate the proposed GTDRG approach.
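
To make the definition concrete, the following sketch enumerates the pure-strategy Nash equilibria of a small, hypothetical two-player game by checking that no player can raise their own payoff by deviating unilaterally; the payoff values are illustrative only.

from itertools import product

# Payoffs for a small two-player game: (row action, column action) -> (row payoff, column payoff).
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}
actions = ["cooperate", "defect"]

def is_nash(profile):
    """True if no player can raise their own payoff by deviating unilaterally."""
    row, col = profile
    row_ok = all(payoffs[(r, col)][0] <= payoffs[profile][0] for r in actions)
    col_ok = all(payoffs[(row, c)][1] <= payoffs[profile][1] for c in actions)
    return row_ok and col_ok

print([p for p in product(actions, actions) if is_nash(p)])   # -> [('defect', 'defect')]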

As an example, the crowdsourcing design problem can be mapped into a game model design in real life. In large-scale crowdsourcing markets such as Amazon's Mechanical Turk, ClickWorker, and CrowdFlower, the game conditions include:

·There is a requester, who wants to hire workers to accomplish given tasks;

·Each worker is assumed to give some utility to the requester on getting hired;

·Each worker has a minimum cost for which he or she is willing to be hired, and this cost is private information of the worker;

·Given the limited budget of the requester, how can one pick the right set of workers to hire so as to maximize the requester's utility?

Once GTDRG receives these conditions, the following relations are extracted to form Table 1. The payoffs become the relations between the players when they are mapped into the game theory model. The results can then be tested in a simulation environment, such as the one in Figure 3.
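
As a simple illustration of the selection question in the last condition, the sketch below greedily hires workers by utility per unit of cost under a fixed budget; the numbers are hypothetical, and a real market would additionally need an incentive-compatible payment rule, which this sketch does not attempt.

# Hypothetical workers with the utility they bring and the cost they ask for.
workers = [
    {"name": "w1", "utility": 8.0, "cost": 4.0},
    {"name": "w2", "utility": 5.0, "cost": 1.0},
    {"name": "w3", "utility": 6.0, "cost": 5.0},
]
budget = 6.0

hired, spent, gained = [], 0.0, 0.0
# Greedy heuristic: favour workers with the highest utility per unit of cost.
for w in sorted(workers, key=lambda w: w["utility"] / w["cost"], reverse=True):
    if spent + w["cost"] <= budget:
        hired.append(w["name"])
        spent += w["cost"]
        gained += w["utility"]

print(hired, spent, gained)   # ['w2', 'w1'] 5.0 13.0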

Table 1.   Mapping of crowdsourcing in the game-based model



Figure 3. Simulation of a crowdsourcing game model (in NetLogo)


The data relations in the game theory model form a representation that is interpretable by further game simulators and can be encoded using RDF or any other script-based notation. Here is an example:

<rdf:RDF
  xmlns:rdf="http://www.creacomp.org/01-rdf-syntax-ns#"
  xmlns:feature="http://www.creacomp.org/linkeddatatools/task-features#">
  <rdf:Description rdf:about="http://www.creacomp.org/linkeddatatools/task#text-mining">
    <feature:difficulty-level>2</feature:difficulty-level>
    <feature:status rdf:resource="http://www.creacomp.org/linkeddatatools/status#open"/>
  </rdf:Description>
</rdf:RDF>

The declaration xmlns:rdf="http://www.creacomp.org/01-rdf-syntax-ns#" lets the machine recognise that the enclosing document is an RDF-based document, and the tag rdf:Description tells the machine that the operator is going to describe a subject, the "task". In RDF terminology, a statement such as <feature:difficulty-level>2</feature:difficulty-level> means that the difficulty level of the task is 2, and <feature:status rdf:resource="http://www.creacomp.org/linkeddatatools/status#open"/> indicates that the object of one RDF statement is able to reference the subject of another statement.

6. Conclusions

This paper aims to create a novel approach, following the guidance of creative computing, to construct and apply game theory and software-based knowledge in the business field. The proposed research illustrates that the concepts of creative computing, the Internet of things, and big data can serve as the conceptual foundation of the demonstration. The contribution of the research, the Game Theory Data Relations Generator (GTDRG), is described and demonstrated after the background is presented. Using Semantic Web and data mining as kernel techniques to build the system, the paper explains its functions and processing principles in detail. In the operation of the system, the core algorithm is the Semantic Labeller Algorithm, whose purpose is to solve the label extraction problem in order to generate appropriate data relations. The algorithm achieves a data conversion between tabular data and RDF data and then maps the labels into the game theory model. It expands the storage volume of the information in organisations' systems and improves the accuracy of data combination through knowledge combination and cloud computing. However, some problems remain during the operation of GTDRG, especially concerning accuracy and information security. It is essential to preserve the security of organisations' private information; thus, we plan to develop new algorithms to revise and solve these problems. At the same time, we will carry out further studies and research on such problems in order to perfect GTDRG.

References

1. A. Hugill and H. Yang, “The Creative Turn: New Challenges for Computing,” International Journal of Creative Computing, Vol. 1, No. 1, pp. 4-19, 2013
2. C. Bizer, T. Heath, and T. Berners-Lee, “Linked Data - The Story So Far,” Semantic Services, Interoperability and Web Applications: Emerging Concepts, pp. 205-227, 2009
3. C. Bizer, “The Emerging Web of Linked Data,” IEEE Intelligent Systems, Vol. 24, pp. 87-92, 2009
4. C. Bizer, R. Cyganiak, and T. Heath, “How to Publish Linked Data on the Web,” Karlsruhe, 2008
5. P. Mendes, M. Jakob, A. García-Silva, and C. Bizer, “DBpedia Spotlight: Shedding Light on the Web of Documents,” in Proceedings of the 7th International Conference on Semantic Systems, pp. 1-8, New York, NY, USA, 2011
6. T. Heath and C. Bizer, “Linked Data: Evolving the Web into a Global Data Space,” Synthesis Lectures on the Semantic Web: Theory and Technology, Vol. 1, pp. 1-136, Morgan & Claypool, 2011
7. C. Bizer, T. Heath, K. Idehen, and T. Berners-Lee, “Linked Data on the Web (LDOW2008),” in Proceedings of the 17th International Conference on World Wide Web, pp. 1265-1266, ACM, New York, NY, USA, 2008
8. D. Lathrop and L. Ruma, “Open Government: Collaboration, Transparency, and Participation in Practice,” O'Reilly Media Inc., 2010
9. F. Bauer and M. Kaltenböck, “Linked Open Data: The Essentials,” Edition Mono/Monochrome, Vienna, 2011
10. T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, pp. 34-43, 2011
11. T. Berners-Lee, Y. Chen, L. Chilton, D. Connolly, R. Dhanaraj, J. Hollenbach, et al., “Tabulator: Exploring and Analyzing Linked Data on the Semantic Web,” in Proceedings of the 3rd International Semantic Web User Interaction Workshop, MIT, Cambridge, MA, 2006
12. J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov, “Discovering and Maintaining Relations on the Web of Data,” in Proceedings of the 8th International Semantic Web Conference, pp. 650-665, Springer-Verlag, Berlin, Heidelberg, 2009
13. C. Bizer, T. Heath, and T. Berners-Lee, “Linked Data: Principles and State of the Art,” in Proceedings of the 17th International World Wide Web Conference, W3C Track @WWW2008, pp. 1-40, Beijing, China, 2008
14. D. Allemang and J. Hendler, “Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL,” Morgan Kaufmann, 2011
15. C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cyganiak, et al., “DBpedia - A Crystallization Point for the Web of Data,” Journal of Web Semantics, pp. 154-165, 2009
16. S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, “DBpedia: A Nucleus for a Web of Open Data,” in Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference, pp. 722-735, 2007
17. V. Verma, M. Ranjan, and P. Mishra, “Text Mining and Information Professionals: Role, Issues and Challenges,” in Proceedings of the 2015 4th International Symposium on Emerging Trends and Technologies in Libraries and Information Services, pp. 133-137, 2015
18. D. Sánchez, M. J. Martín-Bautista, I. Blanco, and C. Justicia de la Torre, “Text Knowledge Mining: An Alternative to Text Data Mining,” in Proceedings of the 2008 IEEE International Conference on Data Mining Workshops, pp. 664-672, 2008
19. E. Cambria and B. White, “Jumping NLP Curves: A Review of Natural Language Processing Research,” IEEE Computational Intelligence Magazine, Vol. 9, No. 2, pp. 48-57, 2014
20. M. Carenini, A. Whyte, L. Bertorello, and M. Vanocchi, “Improving Communication in E-democracy Using Natural Language Processing,” IEEE Intelligent Systems, Vol. 22, No. 1, pp. 20-27, 2007
21. P. Y. Song, A. H. Shu, D. Phipps, M. Tiwari, D. S. Wallach, J. R. Crandall, et al., “Language without Words: A Pointillist Model for Natural Language Processing,” in Proceedings of the 6th International Conference on Soft Computing and Intelligent Systems and the 13th International Symposium on Advanced Intelligent Systems, 2012
22. T. M. Michael and N. G. Bourbakis, “Graph-based Methods for Natural Language Processing and Understanding - A Survey and Analysis,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, Vol. 44, No. 1, pp. 59-71, 2014
23. L. Liu and M. T. Özsu, “Encyclopedia of Database Systems,” Springer, 2009
24. M. Sukanya and S. Biruntha, “Techniques on Text Mining,” in Proceedings of the 2012 IEEE International Conference on Advanced Communication Control and Computing Technologies, pp. 269-271, 2012