Monday, January 27, 2020
Identifying Clusters in High Dimensional Data
“Ask those who remember, are mindful if you do not know.” (Holy Quran, 6:43)

Removal Of Redundant Dimensions To Find Clusters In N-Dimensional Data Using Subspace Clustering

Abstract

Data mining has emerged as a powerful tool for extracting knowledge from huge databases. Researchers have introduced several machine learning algorithms that explore databases to discover information, hidden patterns, and rules that were not known when the data was recorded. Thanks to remarkable developments in storage capacity, processing power, and algorithmic tools, practitioners are developing new and improved algorithms and techniques in several areas of data mining to discover rules and relationships among attributes in both simple and complex high-dimensional databases. Data mining is applied in a wide variety of areas, from banking to marketing, from engineering to bioinformatics, and from investment to risk analysis and fraud detection. Practitioners are also analyzing and implementing artificial neural network techniques for classification and regression problems because of their accuracy and efficiency. The aim of this short research project is to develop a way of identifying the clusters in high-dimensional data, as well as the redundant dimensions that create noise and obscure those clusters. The technique used in this project relies on the projections of the data points along the dimensions: the intensity of projection along each dimension is used to find both the clusters and the redundant dimensions.

1 Introduction

In numerous scientific settings, engineering processes, and business applications, ranging from experimental sensor data and process control data to telecommunication traffic observation and financial transaction monitoring, huge amounts of high-dimensional measurement data are produced and stored.
Whereas sensor equipment and large storage devices are getting cheaper day by day, data analysis tools and techniques lag behind. Clustering methods are common solutions to unsupervised learning problems, where neither expert knowledge nor helpful annotation of the data is available. In general, clustering groups the data objects so that similar objects end up in the same cluster, whereas objects from different clusters are highly dissimilar. However, clustering often discloses almost no structure even when it is known that groups of similar objects must exist. In many cases, the reason is that the cluster structure is induced by only a subset of the space's dimensions, and the many additional dimensions contribute nothing but noise that hinders the discovery of the clusters. As a solution to this problem, clustering algorithms are applied to the relevant subspaces only. The new question is then how to determine the relevant subspaces among the dimensions of the full space. The candidates form the power set of the set of dimensions, so a brute-force trial of all subsets is infeasible: their number grows exponentially with the original dimensionality.

In high-dimensional data, as the number of dimensions increases, visualizing and representing the data becomes more difficult, and the additional dimensions can sometimes create a bottleneck. As dimensions are added, the data also appears to disperse towards the corners of the space. Subspace clustering addresses both problems in parallel. It identifies the relevant subspaces, so that the remaining dimensions can be marked as redundant, and it finds the cluster structures that become apparent in these subspaces.
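To make the scale of the brute-force search mentioned above concrete, here is a small illustrative sketch. The function names are hypothetical and not part of the original project; the sketch only counts and enumerates the non-empty subsets of d dimensions, of which there are 2^d - 1.

```python
from itertools import combinations


def count_subspaces(d):
    """Number of non-empty subsets of d dimensions: 2**d - 1."""
    return 2 ** d - 1


def enumerate_subspaces(dims):
    """Yield every non-empty subset of the given dimension names."""
    for size in range(1, len(dims) + 1):
        yield from combinations(dims, size)


if __name__ == "__main__":
    # Three dimensions already give seven candidate subspaces.
    for subspace in enumerate_subspaces(["x", "y", "z"]):
        print(subspace)
    # The count doubles (plus one) with every added dimension.
    for d in (3, 10, 20, 50):
        print(d, count_subspaces(d))
```

Enumeration is only practical for a handful of dimensions, which is why subspace clustering methods need a way to prune or rank dimensions instead of trying every subset.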
Subspace clustering is an extension of traditional clustering that automatically finds the clusters present in subspaces of a high-dimensional data space. These subspaces allow the data points to be clustered better than the original space does, and the approach works even when the curse of dimensionality occurs. Most clustering algorithms have been designed to discover clusters in the full-dimensional space, so they are not effective at identifying clusters that exist within a subspace of the original data space. Most clustering algorithms also produce results that depend on the order in which the input records were processed [2].

Subspace clustering can identify the different clusters within subspaces of a huge amount of sales data, and through them we can find which attributes are related. This can be useful in promoting sales and in planning the inventory levels of different products. It can also be used to find subspace clusters in spatial databases, and useful decisions can be taken based on the clusters identified [2].

The technique used here to identify the redundant dimensions that create noise in the data, and thereby to identify the clusters, consists of several steps. First, the data points are plotted in all dimensions. Second, the projections of all data points along each dimension are plotted. Third, the unions of the projections are plotted for all possible combinations of dimensions. Finally, the union of all projections along all dimensions is plotted and analyzed; it shows the contribution of each dimension to identifying the clusters, which is represented by the weight of projection. If a dimension contributes very little to the weight of projection, that dimension can be considered redundant, meaning that it is not important for identifying the clusters in the given data.
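The text above does not define the weight of projection precisely, so the following is only a rough sketch under stated assumptions, not the project's actual method. It assumes every coordinate is scaled to the interval [0, 1], projects the points onto each axis, and uses the share of points in the densest histogram bin as a stand-in for the weight. A dimension whose weight stays close to what uniform noise would give is flagged as redundant. The names `projection_weights`, `redundant_dimensions`, `make_demo_data`, and the `factor` threshold are all hypothetical.

```python
import random


def projection_weights(points, bins=10):
    """Return one weight per dimension.

    Assumed stand-in for the "weight of projection": the fraction of
    points that fall into the densest bin of the 1-D projection.
    Coordinates are assumed to be scaled to [0, 1].
    """
    n_dims = len(points[0])
    weights = []
    for d in range(n_dims):
        counts = [0] * bins
        for p in points:
            # Clamp the bin index so values at or beyond the edges stay in range.
            idx = max(0, min(int(p[d] * bins), bins - 1))
            counts[idx] += 1
        weights.append(max(counts) / len(points))
    return weights


def redundant_dimensions(points, bins=10, factor=2.0):
    """Flag dimensions whose weight is below `factor` times the uniform share.

    A uniformly spread projection puts about 1/bins of the points in each
    bin, so factor / bins is an assumed, crude cut-off.
    """
    threshold = factor / bins
    weights = projection_weights(points, bins)
    return [d for d, w in enumerate(weights) if w < threshold]


def make_demo_data(seed=0):
    """Two tight clusters in dimensions 0 and 1, uniform noise in dimension 2."""
    rng = random.Random(seed)
    points = []
    for center in (0.25, 0.75):
        for _ in range(100):
            points.append([rng.gauss(center, 0.02),
                           rng.gauss(center, 0.02),
                           rng.random()])
    return points


if __name__ == "__main__":
    data = make_demo_data()
    print(projection_weights(data))
    print(redundant_dimensions(data))
```

In the demo data the clustered dimensions concentrate roughly half of the points in a single bin, while the noise dimension spreads them across all bins, so only the noise dimension falls under the threshold. A fixed-width histogram is a deliberately simple choice; clusters that straddle a bin boundary would need a finer or adaptive binning.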
The details of this strategy will be covered in later chapters.

2 Data Mining

2.1 What is Data Mining?

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. The information can be used for many purposes, such as increasing revenue or cutting costs. The data mining process also uncovers hidden knowledge and relationships within the data that were not known when the data was recorded. Describing the data is the first step in data mining, followed by summarizing its attributes (such as mean and standard deviation). The data is then reviewed using visual tools such as charts and graphs, and meaningful relations are determined. In the data mining process, the steps of collecting, exploring, and selecting the right data are critically important. Users can analyze data from different dimensions, categorize it, and summarize it. Data mining finds correlations or patterns among the fields in large databases.

Data mining has great potential to help companies focus on the most important information in their data warehouses. It can predict future trends and behaviors, allowing businesses to make more proactive, knowledge-driven decisions. It can answer business questions that were traditionally too time-consuming to resolve. It scours databases for hidden patterns, finding predictive information that experts may miss because it lies beyond their expectations. Data mining is normally used to transform data into information or knowledge. It is commonly used in a wide range of profiling practices, such as marketing, fraud detection, and scientific discovery. Many companies already collect and refine their data, and data mining techniques can be implemented on existing platforms to enhance the value of those information resources. Data mining tools can analyze massive databases to deliver answers to such questions.
Some other terms carry a similar meaning to data mining, such as “Knowledge Mining”, “Knowledge Extraction”, or “Pattern Analysis”. Data mining can also be treated as a synonym for Knowledge Discovery from Data (KDD), although some people regard data mining as just one essential step in knowledge discovery from large data. The process of knowledge discovery from data consists of the following steps.

* Data cleaning (removing noise and inconsistent data)
* Data integration (combining multiple data sources)
* Data selection (retrieving the data relevant to the analysis task from the database)
* Data transformation (transforming the data into forms appropriate for mining by performing summary or aggregation operations)
* Data mining (applying intelligent methods in order to extract data patterns)
* Pattern evaluation (identifying the truly interesting patterns representing knowledge, based on some measures)
* Knowledge representation (using knowledge representation techniques to present the mined knowledge to the user)

2.2 Data

Data can be any facts, text, images, or numbers that can be processed by a computer. Today's organizations are accumulating large and growing amounts of data in different formats and in different databases. This includes operational or transactional data, such as costs, sales, inventory, payroll, and accounting. It also includes non-operational data, such as industry sales and forecast data, and metadata, which is data about the data itself, such as logical database designs and data dictionary definitions.

2.3 Information

Information can be retrieved from data via the patterns, associations, or relationships that exist in it. For example, retail point-of-sale transaction data can be analyzed to yield information about which products are being sold and when.

2.4 Knowledge

Knowledge can be retrieved from information via historical patterns and future trends.
For example, analyzing retail supermarket sales data with respect to promotional efforts can provide knowledge about customers' buying behavior. A manufacturer can then easily determine which items are most susceptible to promotional efforts.

2.5 Data warehouse

Advances in data capture, processing power, data transmission, and storage technologies are enabling industry to integrate its various databases into data warehouses. The process of centralizing and retrieving the data is called data warehousing. Data warehousing is a new term, but the concept is somewhat older. A data warehouse is a store of a massive amount of data in electronic form, and data warehousing represents an ideal way of maintaining a central repository for all organizational data. The purpose of a data warehouse is to maximize user access and analysis. Data from different sources is extracted, transformed, and then loaded into the data warehouse. Users and clients can generate different types of reports and perform business analysis by accessing the data warehouse.

Data mining is primarily used today by companies with a strong consumer focus: retail, financial, communication, and marketing organizations. It allows these organizations to evaluate associations between internal and external factors. Product positioning, price, and staff skills are examples of internal factors. Economic indicators, customer demographics, and competition are examples of external factors. It also allows them to calculate the impact on sales, corporate profits, and customer satisfaction. Furthermore, it allows them to drill down from summary information into detailed transactional data. Given databases of sufficient size and quality, data mining technology can generate new business opportunities. Data mining automates the procedure of searching for predictive information in huge databases.
Questions that traditionally required extensive hands-on analysis can now be answered directly from the data very quickly. Targeted marketing is an example of a predictive problem: data mining uses data on previous promotional mailings to identify the targets most likely to maximize the return on investment in future mailings. Data mining tools also traverse huge databases and discover previously unseen patterns in a single step. Analyzing retail sales data to recognize apparently unrelated products that are often purchased together is one example. Other pattern discovery problems include identifying fraudulent credit card transactions and identifying irregular data that could indicate data entry errors. When data mining tools run on high-performance parallel processing systems, they can analyze huge databases in very little time. Faster processing means that users can experiment with the data in more detail in order to understand complex data. High speed and quick response make it practical for users to examine huge amounts of data, and larger databases, in turn, yield better predictions.

2.6 Descriptive and Predictive Data Mining

Descriptive data mining aims to find patterns in the data that provide some information about what the data contains. It describes patterns in existing data and is generally used to create meaningful subgroups, such as demographic clusters. Typical descriptive outputs are summaries and visualizations, clustering, and link analysis. Predictive data mining is used to forecast explicit values based on patterns determined from known results. For example, from a database of clients who have already responded to a specific offer, a model can be built that predicts which prospects are most likely to respond to the same offer.
The term is usually applied to data mining projects whose goal is to identify a statistical or neural network model, or a set of models, that can be used to predict some response of interest. For example, a credit card company may want to engage in predictive data mining to derive a (trained) model or set of models that can quickly identify transactions with a high probability of being fraudulent. Other types of data mining projects may be more exploratory in nature (e.g. determining the clusters or segments of customers), in which case drill-down, descriptive, and exploratory methods need to be applied. Predictive data mining is goal oriented. It can be decomposed into the following major tasks.

* Data Preparation
* Data Reduction
* Data Modeling and Prediction
* Case and Solution Analysis

2.7 Text Mining

Text mining is sometimes also called text data mining and is roughly equivalent to text analytics. Text mining is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning. Text mining usually involves structuring the input text (usually parsing it, adding some derived linguistic features and removing others, and then inserting the result into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. “High quality” in text mining usually refers to some combination of relevance, novelty, and interestingness. Text mining tasks include text categorization, concept/entity extraction, text clustering, sentiment analysis, production of rough taxonomies, entity relation modeling, and document summarization. Text mining is also described as the discovery by computer of new, previously unknown information by automatically extracting information from different written resources.
Linking the extracted information together is the key element in creating new facts or new hypotheses, which can then be examined further by more conventional means of experimentation. In text mining, the goal is to discover unknown information: something that no one yet knows and so could not yet have written down. The difference between ordinary data mining and text mining is that in text mining the patterns are retrieved from natural language text rather than from structured databases of facts. Databases are designed for programs to process automatically; text is written for people to read. Most researchers think that a full-fledged simulation of how the brain works will be needed before programs that read the way people do can be written.

2.8 Web Mining

Web mining is the technique used to automatically discover and extract information from web documents and services. The interest of various research communities, the tremendous growth of information resources on the Web, and the recent interest in e-commerce have made this a very large area of research. Web mining can usually be decomposed into the following subtasks.

* Resource finding: fetching the intended web documents.
* Information selection and pre-processing: automatically selecting and pre-processing specific information from the fetched web resources.
* Generalization: automatically discovering general patterns at individual websites and across multiple websites.
* Analysis: validating and explaining the mined patterns.

Web mining can be divided into three main areas of interest, based on which part of the Web is to be mined: Web Content Mining, Web Structure Mining, and Web Usage Mining. Web Content Mining describes the discovery of useful information from web contents, data, and documents [10]. In the past the internet consisted only of different types of services and data resources. Today most data is available over the internet, and even digital libraries are available on the Web.
Web content consists of several types of data, including text, images, audio, video, metadata, and hyperlinks. Most companies are trying to transform their business and services into electronic form and put them on the Web. As a result, company databases that previously resided on legacy systems are now accessible over the Web, so employees, business partners, and even end clients can access them. Users reach applications through web interfaces, which is a further reason companies are moving their business onto the Web: the internet can connect to any other computer anywhere in the world [11]. Some web content is hidden and therefore cannot be indexed; dynamically generated data from database queries and private data fall into this area. Web content includes unstructured data such as free text, semi-structured data such as HTML, and fully structured data such as tables or database-generated web pages. However, most web content is unstructured text. Work on web content mining is mostly done from two points of view: the information retrieval (IR) view and the database (DB) view. From the IR view, web content mining assists and improves information finding or filtering for the user. From the DB view, web content mining models the data on the web and integrates it so that more sophisticated queries than keyword searches can be performed [10].

In Web Structure Mining, we are concerned with the structure of the hyperlinks within the web itself, which can be called the inter-document structure [10]. It is closely related to web usage mining [14]. Pattern detection and graph mining are essentially related to web structure mining, and link analysis techniques can be used to determine patterns in the graph.
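As a minimal illustration of link analysis on a hyperlink graph, the sketch below counts how many distinct pages point to each page in a toy web. The graph, the page names, and the function names are made up for illustration. Plain in-link counting is used here as the simplest possible link analysis; it is not the ranking algorithm of any real search engine.

```python
from collections import Counter


def inlink_counts(links):
    """Count distinct in-links per page.

    `links` maps each page to the list of pages it links to.
    Self-links and repeated links from the same source are ignored.
    """
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def pages_pointing_to(links, page):
    """Return the sorted list of pages that link to `page`."""
    return sorted(src for src, targets in links.items()
                  if page in targets and src != page)


# Hypothetical four-page web used only for this example.
toy_web = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "blog": ["home", "products"],
}

if __name__ == "__main__":
    counts = inlink_counts(toy_web)
    # Pages with more in-links come first.
    for page, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        print(page, n, pages_pointing_to(toy_web, page))
```

Rank-based schemes refine this idea by weighting each in-link by the rank of the page it comes from, which requires an iterative computation instead of a single counting pass.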
Search engines such as Google typically use web structure mining. For example, links are mined so that one can determine the web pages that point to a particular web page. When a string is searched, the web page with the most links pointing to it may appear first in the list. This is why web pages are listed by rank, where the rank of a page is calculated from the ranks of the pages that point to it [14]. Based on the kind of structural data used, web structure mining can be divided into two categories. The first deals with extracting patterns from the hyperlinks in the web; a hyperlink is a structural component that connects a web page to a different web page or to a different location. The second deals with the document structure, using the tree-like structure of a page to analyze and describe its HTML or XML tags.

With the continuous growth of e-commerce, web services, and web applications, the volume of clickstream and user data collected by web-based organizations in their daily operations has increased. Organizations can analyze such data to determine the lifetime value of clients, design cross-marketing strategies, and so on [13]. Web usage mining deals with the data generated by users' clickstreams. “The web usage data includes web server access logs, proxy server logs, browser logs, user profile, registration data, user sessions, transactions, cookies, user queries, bookmark data, mouse clicks and scrolls and any other data as a result of interaction” [10]. Web usage mining is therefore regarded as the most important task of web mining [12]. Web log databases can provide rich information about web dynamics. In web usage mining, web log records are mined to discover user access patterns, through which potential customers can be identified, the quality of internet services can be enhanced, and web server performance can be improved.
Many techniques can be developed to implement web usage mining, but the success of such applications depends on what and how much valid and reliable knowledge can be discovered from the log data. Most often, the web logs are cleaned, condensed, and transformed before any useful and significant information is extracted from them. Web mining can be performed on web log records to find association patterns, sequential patterns, and trends in web access. The overall web usage mining process can be divided into three interdependent stages: data collection and pre-processing, pattern discovery, and pattern analysis [13]. In the data collection and pre-processing stage, the raw data is collected, cleaned, and transformed into a set of user transactions that represent the activities of each user during visits to the web site. In the pattern discovery stage, statistical, database, and machine learning operations are performed to retrieve hidden patterns representing the typical behavior of users, as well as summary statistics on web resources, sessions, and users.

3 Classification

3.1 What is Classification?

As the quantity and variety of available data increase, robust, efficient, and versatile data categorization techniques are needed for exploration [16]. Classification is a method of assigning class labels to patterns. It is a data mining methodology used to predict group membership for data instances. For example, one may want to use classification to guess whether the weather on a specific day will be “sunny”, “cloudy”, or “rainy”. Techniques that group similar data objects or points and separate them from dissimilar ones are called clustering. Classification uses the attribute values found in the data of one class to distinguish it from other classes. Data classification is mainly concerned with the treatment of large datasets.
In classification we build a model by analyzing the existing data and describing the characteristics of the various classes of data. We can then use this model to predict the class of new data. Classification is a supervised machine learning procedure in which individual items are placed into groups based on quantitative information about one or more of their characteristics. Decision trees and Bayesian networks are examples of classification methods. Clustering can be viewed as an unsupervised form of classification: it is the process of finding similar data objects or points within a given dataset. The similarity can be expressed through distance measures or through any other parameter, depending on the need and the given data.

Classification is an ancient term as well as a modern one, since the classification of animals, plants, and other physical objects is still valid today. Classification is a way of thinking about things rather than a study of the things themselves, so it draws its theory and application from the complete range of human experience and thought [18]. In the bigger picture, classification can cover medical patients grouped by disease, the set of images containing a red rose in an image database, the set of documents describing “classification” in a document database, equipment malfunctions grouped by cause, and loan applicants grouped by their likelihood of repayment. In the last case, for example, the problem is to predict a new applicant's loan eligibility given old data about customers. Many techniques are used for data classification. The most common are decision tree classifiers and Bayesian classifiers.

3.2 Types of Classification

There are two types of classification: supervised classification and unsupervised classification. Supervised learning is a machine learning technique for discovering a function from training data. The training data consists of pairs of input objects and their desired outputs.
The output of the function can be a continuous value, in which case the task is called regression, or a class label for the input object, in which case it is called classification. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this, the learner needs to generalize from the presented data to unseen situations in a meaningful way.

Unsupervised learning is a class of machine learning problems in which one seeks to determine how the data is organized. It is distinguished from supervised learning in that the learner is given only unlabeled examples. Unsupervised learning is closely related to the problem of density estimation in statistics. However, it also covers many other techniques that are used to summarize and explain key features of the data. One form of unsupervised learning is clustering, which will be covered in the next chapter. Blind source separation based on Independent Component Analysis is another example. Among neural network models, adaptive resonance theory and self-organizing maps are commonly used unsupervised learning algorithms.

There are many techniques for implementing supervised classification. We will discuss the two most commonly used: Decision Tree classifiers and Naïve Bayesian classifiers.

3.2.1 Decision Trees Classifier

There are many alternative ways to represent classifiers. The decision tree is probably the most widely used approach for this purpose, and it is one of the most widely used supervised learning methods for data exploration. It is easy to use, can be represented as if-then-else rules, and works well on noisy data [16]. A decision tree uses a tree-like graph or model of decisions and their possible consequences, including resource costs, chance event outcomes, and utilities.
Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal. In machine learning and data mining, a decision tree is used as a predictive model, that is, a mapping from observations about an item to conclusions about its target value. More descriptive names for such tree models are classification tree and regression tree. In these tree structures, leaves represent classifications and branches represent the conjunctions of features that lead to those classifications. The machine learning technique for inducing a decision tree from data is called decision tree learning, or simply decision trees. Decision trees are a simple but powerful form of multiple-variable analysis [15]. Classification is done by tree-like structures that apply a different test criterion for a variable at each node, and new leaves are generated based on the results of the tests at the nodes. A decision tree is a supervised learning system in which classification rules are constructed from the tree. Decision trees are produced by algorithms that identify various ways of splitting a data set into branch-like segments, and they try to find a strong relationship between the input values and the target values within the dataset [15].

In classification tasks, decision trees visualize the steps that should be taken to reach a classification. Every decision tree starts with a root node, which is the ancestor of every other node. Each node in the tree evaluates an attribute of the data and decides which path to follow. Typically the decision test is a comparison of a value against some constant. Classification with a decision tree is done by traversing from the root node down to a leaf node. Decision trees are able to represent and classify diverse types of data. The simplest and most familiar form of data is numerical data.
Organizing nominal data is also required in many situations. Nominal quantities are normally represented by a discrete set of symbols. For example, a weather condition can be described in either a numeric or a nominal fashion. The temperature can be quantified by saying that it is eleven degrees Celsius or fifty-two degrees Fahrenheit. The terms cold, cool, mild, warm, and hot can also be used. The former is numeric data, while the latter is an example of nominal data. More precisely, cold, cool, mild, warm, and hot form a special type of nominal data called ordinal data. Ordinal data carries an implicit assumption of an ordered relationship among the values. In the weather example, purely nominal descriptions such as rainy, overcast, and sunny can also be added. These values have no ordering or distance measure among them.

Decision trees are trees in which each node is a question, each branch is an answer to that question, and each leaf is a result. Here is an example of a decision tree. Roughly, the idea is that we make different purchasing decisions depending on the number of items in stock. If we do not have much stock, we buy at any cost. If we have a lot of items, we buy only if the price is low. If there are fewer than 10 items in stock, buy all if the unit price is less than 10; otherwise buy only 10 items. If there are 10 to 40 items in stock, check the unit price: if it is less than £5, buy only 5 items; otherwise there is no need to buy anything expensive, since the stock is already good. If there are more than 40 items in stock, buy 5 if and only if the price is less than £2; otherwise there is no need to buy expensive items. In this way a decision tree helps us make a decision at each level. Here is another example of a decision tree, representing the risk associated with rash driving.
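Both small trees in this section, the stock-purchase tree above and the driving-risk tree described next, can be written as nested conditionals, which is one way to see the "each node is a question" idea. The sketch below is illustrative only. The function and parameter names are hypothetical; `available` stands in for the quantity meant by "buy all", which the text does not specify; and treating ages of exactly 20 and 30 as the middle category is an assumption, since the text leaves those boundaries open.

```python
def units_to_buy(stock, unit_price, available):
    """Stock-purchase decision tree.

    `available` is an assumed parameter for the quantity on offer,
    used when the rule says "buy all".
    """
    if stock < 10:                      # low stock
        return available if unit_price < 10 else 10
    if stock <= 40:                     # medium stock: 10 to 40 items
        return 5 if unit_price < 5 else 0
    return 5 if unit_price < 2 else 0   # high stock: more than 40 items


def driving_risk(age, car_type):
    """Driving-risk decision tree: the root splits on age, then on car type."""
    if age < 20:
        return "very high"
    if age > 30:
        return "very low"
    # Middle category (ages 20 to 30, boundaries assumed inclusive):
    # the decision depends on the car type.
    if car_type == "sports":
        return "high"
    if car_type == "family":
        return "low"
    raise ValueError(f"unknown car type: {car_type!r}")


if __name__ == "__main__":
    print(units_to_buy(stock=5, unit_price=8, available=30))
    print(units_to_buy(stock=25, unit_price=4, available=30))
    print(driving_risk(25, "sports"))
```

Each `if` corresponds to an internal node, each branch of the condition to an edge, and each `return` to a leaf, so classifying an input means following exactly one path from the root to a leaf.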
The root node at the top of the tree structure shows the feature that is split first for the highest discrimination. The internal nodes show decision rules on one or more attributes, while the leaf nodes are class labels. A person younger than 20 has a very high risk, while a person older than 30 has a very low risk. For the middle category, a person older than 20 but younger than 30, the risk depends on another attribute, the car type: if the car is a sports car there is again a high risk, while if it is a family car there is a low risk.

In science and engineering, and in applied areas including business intelligence and data mining, many useful features are being introduced as a result of the evolution of decision trees.

* With the help of transformation in decision trees, the volume of data can be reduced into more compact form that preserves the major characteristic

1 Introduction

In numerous scientific settings, engineering processes, and business applications ranging from experimental sensor data and process control data to telecommunication traffic observation and financial transaction monitoring, huge amounts of high-dimensional measurement data are produced and stored. Whereas sensor equipment and large storage devices are getting cheaper day by day, data analysis tools and techniques lag behind. Clustering methods are common solutions to unsupervised learning problems where neither expert knowledge nor helpful annotation for the data is available. In general, clustering groups the data objects so that similar objects end up together in clusters, whereas objects from different clusters are highly dissimilar. However, it is often observed that clustering discloses almost no structure even when it is known that there must be groups of similar objects.
In many cases, the reason is that the cluster structure is induced by only a subset of the space's dimensions, and the many additional dimensions contribute nothing but noise that hinders the discovery of the clusters within the data. As a solution to this problem, clustering algorithms are applied to the relevant subspaces only. The new question is then how to determine the relevant subspaces among the dimensions of the full space. Faced with the power set of the set of dimensions, a brute-force trial of all subsets is infeasible because their number grows exponentially with the original dimensionality (a space of d dimensions has 2^d - 1 non-empty subsets of dimensions).

In high dimensional data, as the number of dimensions increases, visualizing and representing the data becomes more difficult, and the growth in dimensions can sometimes create a bottleneck. As dimensions are added, the data within them appears to disperse towards the corners of the space.

Subspace clustering addresses both problems in parallel. It solves the problem of determining the relevant subspaces, which allows the remaining dimensions to be marked as redundant in high dimensional data, and it also solves the problem of finding the cluster structures within the dataset that become apparent in these subspaces. Subspace clustering is an extension of traditional clustering that automatically finds the clusters present in subspaces of a high dimensional data space; this allows the data points to be clustered better than in the original space, and it works even when the curse of dimensionality occurs. Most clustering algorithms have been designed to discover clusters in the full dimensional space, so they are not effective at identifying clusters that exist within subspaces of the original data space. Most clustering algorithms also produce results that depend on the order in which the input records were processed [2].
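The abstract describes using the projections of the data points along each dimension to judge how much that dimension contributes to the clusters. The following is a minimal sketch of that idea, not the essay's actual method: the essay does not define the "weight of projection" at this point, so the entropy-based score, the function names, the bin count and the threshold are all assumptions chosen for illustration.

```python
import math


def projection_weight(values, bins=10):
    """Score how unevenly points spread along one dimension (about 0 = evenly spread)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # assumption: a constant dimension cannot separate clusters
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[index] += 1
    total = len(values)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c)
    # Assumed score: 1 minus the normalized entropy of the projection histogram.
    return 1.0 - entropy / math.log(bins)


def redundant_dimensions(points, threshold=0.1, bins=10):
    """Return the dimensions whose projection weight falls below the threshold."""
    dims = len(points[0])
    weights = [projection_weight([p[d] for p in points], bins) for d in range(dims)]
    return [d for d, w in enumerate(weights) if w < threshold], weights


# Hypothetical data: dimensions 0 and 1 hold two tight groups, dimension 2 is evenly spread.
points = [
    (
        0.2 + (i % 5) * 0.001 if i % 2 == 0 else 0.8 + (i % 5) * 0.001,
        0.3 if i % 2 == 0 else 0.7,
        i / 100,
    )
    for i in range(100)
]
redundant, weights = redundant_dimensions(points)
print("redundant dimensions:", redundant)
```

Scoring each dimension separately needs only d passes over the data, which avoids enumerating the 2^d - 1 subsets of dimensions. The trade-off is that a per-dimension score can miss clusters that only become visible when several dimensions are considered together.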
Subspace clustering can identify the different clusters within subspaces of a huge amount of sales data, and through it we can find which attributes are related. This can be useful in promoting sales and in planning the inventory levels of different products. It can also be used to find subspace clusters in spatial databases, and useful decisions can be taken based on the subspace clusters identified [2].

The technique used here to identify the redundant dimensions that create noise in the data, in order to identify the clusters, proceeds in steps. First, the data points are plotted in all dimensions. Second, the projections of all data points along each dimension are plotted. Third, the unions of the projections along each dimension are plotted using all possible combinations of the dimensions, and finally the union of all projections along all dimensions is plotted and analyzed. This shows the contribution of each dimension to identifying the clusters, which is represented by the weight of projection. If a given dimension contributes very little to the weight of projection, that dimension can be considered redundant, which means it is not important for identifying the clusters in the given data. The details of this strategy will be covered in later chapters.

2 Data Mining

2.1 What is Data Mining?

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. This information can be used for many purposes, such as increasing revenue or cutting costs. The data mining process also finds hidden knowledge and relationships within the data that were not known when the data was recorded. Describing the data is the first step in data mining, followed by summarizing its attributes (such as the mean and standard deviation).
After that, the data is reviewed using visual tools such as charts and graphs, and meaningful relations are then determined. In the data mining process, the steps of collecting, exploring and selecting the right data are critically important. Users can analyze data from different dimensions, categorize it and summarize it. Data mining finds correlations or patterns among the fields in large databases.

Data mining has great potential to help companies focus on the important information in their data warehouses. It can predict future trends and behaviors, allowing businesses to make more proactive, knowledge-driven decisions. It can answer business questions that were traditionally too time-consuming to resolve. It scours databases for hidden patterns, finding predictive information that experts may miss because it lies beyond their expectations.

Data mining is normally used to transform data into information or knowledge. It is commonly used in a wide range of practices such as marketing, fraud detection and scientific discovery. Many companies already collect and refine their data. Data mining techniques can be implemented on existing platforms to enhance the value of information resources. Data mining tools can analyze massive databases to deliver answers to such questions.

Several other terms have a similar meaning to data mining, such as "knowledge mining", "knowledge extraction" and "pattern analysis". Data mining can also be treated as Knowledge Discovery from Data (KDD), although some people regard data mining as just one essential step in knowledge discovery from large data. The process of knowledge discovery from data contains the following steps.
* Data cleaning (removing noise and inconsistent data)
* Data integration (combining multiple data sources)
* Data selection (retrieving the data relevant to the analysis task from the database)
* Data transformation (transforming the data into forms appropriate for mining by performing summary or aggregation operations)
* Data mining (applying intelligent methods in order to extract data patterns)
* Pattern evaluation (identifying the truly interesting patterns representing knowledge, based on some measures)
* Knowledge representation (using representation techniques to present the mined knowledge to the user)

2.2 Data

Data can be any facts, text, images or numbers that can be processed by a computer. Today's organizations are accumulating large and growing amounts of data in different formats and in different databases. This can include operational or transactional data such as costs, sales, inventory, payroll and accounting. It can also include non-operational data such as industry sales and forecast data, and metadata, that is, data about the data itself, such as the logical database design and data dictionary definitions.

2.3 Information

Information can be retrieved from data through the patterns, associations or relationships that may exist in the data. For example, retail point-of-sale transaction data can be analyzed to yield information about which products are being sold and when.

2.4 Knowledge

Knowledge can be retrieved from information through historical patterns and future trends. For example, analyzing retail supermarket sales data in the light of promotional efforts can provide knowledge of customer buying behavior. Hence a manufacturer can easily determine which items are most susceptible to promotional efforts.
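The step from data to information in the point-of-sale example of Section 2.3 can be illustrated with a small aggregation. This is a sketch under stated assumptions: the record layout, the sample transactions and the function name `summarize_sales` are hypothetical and are not taken from any real dataset.

```python
from collections import Counter

# Hypothetical point-of-sale records: (timestamp "YYYY-MM-DD HH:MM", product, quantity)
transactions = [
    ("2020-01-27 09:15", "milk", 2),
    ("2020-01-27 09:40", "bread", 1),
    ("2020-01-27 17:05", "milk", 1),
    ("2020-01-27 17:30", "coffee", 3),
    ("2020-01-28 09:10", "bread", 2),
]


def summarize_sales(records):
    """Turn raw transactions into 'what is sold' and 'when it is sold' summaries."""
    units_by_product = Counter()
    units_by_hour = Counter()
    for timestamp, product, quantity in records:
        hour = int(timestamp[11:13])  # hour of day taken from the timestamp string
        units_by_product[product] += quantity
        units_by_hour[hour] += quantity
    return units_by_product, units_by_hour


by_product, by_hour = summarize_sales(transactions)
print(dict(by_product))
print(sorted(by_hour.items()))
```

The raw tuples are the data; the two counters are the information, because they answer which products sell and at what time of day.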
2.5 Data warehouse

Advances in data capture, processing power, data transmission and storage technologies are enabling organizations to integrate their various databases into data warehouses. The process of centralizing and retrieving the data is called data warehousing. Data warehousing is a new term, but the concept is somewhat older. A data warehouse is a store of a massive amount of data in electronic form. Data warehousing represents an ideal way of maintaining a central repository for all organizational data. The purpose of a data warehouse is to maximize user access and analysis. Data from different sources is extracted, transformed and then loaded into the data warehouse. Users can generate different types of reports and carry out business analysis by accessing the data warehouse.

Data mining is primarily used today by companies with a strong consumer focus: retail, financial, communication and marketing organizations. It allows these organizations to evaluate associations between internal and external factors. Product positioning, price and staff skills are examples of internal factors; economic indicators, customer demographics and competition are examples of external factors. It also allows them to calculate the impact on sales, corporate profits and customer satisfaction. Furthermore, it allows them to drill down from summary information into detailed transactional data.

Given databases of sufficient size and quality, data mining technology can generate new business opportunities. Data mining automates the process of searching for predictive information in huge databases. Questions that traditionally required extensive hands-on analysis can now be answered directly from the data very quickly. Targeted marketing is an example of a predictive problem.
Data mining uses data on previous promotional mailings to identify the targets most likely to maximize the return on investment in future mailings. Data mining tools traverse huge databases and discover previously unseen patterns in a single step. An example is the analysis of retail sales data to recognize apparently unrelated products that are often purchased together. Other pattern discovery problems include identifying fraudulent credit card transactions and identifying anomalous data that could represent data entry errors.

When data mining tools are run on high-performance parallel processing systems, they can analyze huge databases in a very short time. Faster processing means that users can experiment in more detail to understand complex data. High speed makes it practical for users to examine huge amounts of data, and larger databases, in turn, yield improved predictions.

2.6 Descriptive and Predictive Data Mining

Descriptive data mining aims to find patterns in the data that provide some information about what the data contains. It describes patterns in existing data and is generally used to create meaningful subgroups such as demographic clusters. Descriptions take forms such as summaries and visualization, clustering, and link analysis.

Predictive data mining is used to forecast explicit values, based on patterns determined from known results. For example, from a database of clients who have already responded to a specific offer, a model can be built that predicts which prospects are most likely to respond to the same offer. The term is usually applied to data mining projects whose goal is to identify a statistical or neural network model, or set of models, that can be used to predict some response of interest.
For example, a credit card company may want to engage in predictive data mining to derive a (trained) model or set of models that can quickly identify transactions with a high probability of being fraudulent. Other types of data mining projects may be more exploratory in nature (e.g. to determine the clusters or segments of customers), in which case drill-down descriptive and exploratory methods need to be applied. Predictive data mining is goal oriented. It can be decomposed into the following major tasks.

* Data Preparation
* Data Reduction
* Data Modeling and Prediction
* Case and Solution Analysis

2.7 Text Mining

Text mining, sometimes also called text data mining, is roughly equivalent to text analytics. It is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning. Text mining usually involves structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. High quality in text mining usually refers to some combination of relevance, novelty and interestingness. Typical text mining tasks include text categorization, concept/entity extraction, text clustering, sentiment analysis, production of rough taxonomies, entity relation modeling and document summarization.

Text mining is also described as the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources. Linking the extracted information together is the key to forming new facts or new hypotheses to be explored further by more conventional means of experimentation.
In text mining, the goal is to discover unknown information, something that no one yet knows and so could not yet have written down. The difference between ordinary data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts. Databases are designed for programs to process automatically; text is written for people to read. Many researchers think that a full-fledged simulation of how the brain works will be needed before programs can be written that read the way people do.

2.8 Web Mining

Web mining is the technique used to automatically extract and discover information from web documents and services. The interest of various research communities, the tremendous growth of information resources on the Web and the recent interest in e-commerce have made this a very large area of research. Web mining can usually be decomposed into the following subtasks.

* Resource finding: fetching the intended web documents.
* Information selection and pre-processing: automatically selecting and pre-processing specific information from the fetched web resources.
* Generalization: automatically discovering general patterns at individual websites and across multiple websites.
* Analysis: validation and explanation of the mined patterns.

Web mining can be categorized into three areas of interest based on which part of the Web is mined: Web Content Mining, Web Structure Mining and Web Usage Mining. Web content mining describes the discovery of useful information from web contents, data and documents [10]. In the past the internet consisted only of different types of services and data resources, but today most data is available over the internet; even digital libraries are available on the Web. Web contents consist of several types of data, including text, images, audio, video and metadata as well as hyperlinks.
Most companies are trying to transform their business and services into electronic form and put them on the Web. As a result, company databases that previously resided on legacy systems are now accessible over the Web, so employees, business partners and even end clients are able to access the company's databases over the Web. Users access applications through web interfaces, which is why most companies are trying to move their business onto the Web: the internet is capable of connecting to any other computer anywhere in the world [11].

Some web contents are hidden and hence cannot be indexed. Dynamically generated data from the results of database queries, and private data, fall into this area. Web contents include unstructured data such as free text, semi-structured data such as HTML, and fully structured data such as tables or database-generated web pages; however, unstructured text makes up most of the web contents. Work on web content mining is mostly done from two points of view, the information retrieval (IR) view and the database (DB) view. "From IR view, web content mining assists and improves the information finding or filtering to the user. From DB view web content mining models the data on the web and integrates them so that the more sophisticated queries other than keywords could be performed." [10].

In web structure mining, we are more concerned with the structure of the hyperlinks within the web itself, which can be called the inter-document structure [10]. It is closely related to web usage mining [14]. Pattern detection and graph mining are essentially related to web structure mining. Link analysis techniques can be used to determine the patterns in the graph. Search engines such as Google usually use web structure mining.
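Link analysis of the kind mentioned above can be illustrated with a simplified PageRank-style iteration, in which a page's score is built up from the scores of the pages that link to it. The essay does not name a specific algorithm, so this choice is an assumption made for illustration; the tiny link graph, the damping factor of 0.85 and the fixed iteration count are likewise hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes an equal share of its score to every page it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web graph: pages a, b and d all link to c, and c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))
```

Page `c` ends up with the highest score because it receives the most incoming links, and page `a` comes second because its single incoming link is from the highly ranked `c`.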
For example, the links are mined, and one can then determine the web pages that point to a particular web page. When a string is searched, the web page with the largest number of links pointing to it may come first in the list. That is why web pages are listed by rank, which is calculated from the rank of the web pages pointing to them [14]. Based on the web structural data, web structure mining can be divided into two categories. The first kind deals with extracting patterns from the hyperlinks in the web. A hyperlink is a structural component that connects a web page to a different web page or a different location. The other kind deals with the document structure, using the tree-like structure to analyze and describe the HTML or XML tags within web pages.

With the continuous growth of e-commerce, web services and web applications, the volume of clickstream and user data collected by web-based organizations in their daily operations has increased. Organizations can analyze such data to determine the lifetime value of clients, design cross-marketing strategies, etc. [13]. Web usage mining deals with the data generated by users' clickstreams. "The web usage data includes web server access logs, proxy server logs, browser logs, user profile, registration data, user sessions, transactions, cookies, user queries, bookmark data, mouse clicks and scrolls and any other data as a result of interaction" [10]. Web usage mining is therefore the most important task of web mining [12]. Weblog databases can provide rich information about web dynamics. In web usage mining, web log records are mined to discover user access patterns, through which potential customers can be identified, the quality of internet services can be enhanced and web server performance can be improved.
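As a minimal sketch of how raw web log records might be turned into per-user visits before any pattern mining, the example below groups requests into sessions. The log layout, the sample records, the function name `sessionize` and the 30-minute idle timeout are all assumptions made for illustration; real server logs carry more fields and need more cleaning.

```python
from datetime import datetime, timedelta

# Hypothetical access-log records: (user id, timestamp, requested page)
log = [
    ("u1", "2020-01-27 10:00:00", "/home"),
    ("u1", "2020-01-27 10:05:00", "/products"),
    ("u2", "2020-01-27 10:06:00", "/home"),
    ("u1", "2020-01-27 11:30:00", "/home"),
    ("u1", "2020-01-27 11:32:00", "/checkout"),
]


def sessionize(records, timeout=timedelta(minutes=30)):
    """Group each user's requests into visits separated by long idle gaps."""
    parsed = sorted(
        (user, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), page)
        for user, stamp, page in records
    )
    sessions = []   # list of (user, [pages visited in order])
    last_seen = {}  # user -> (time of last request, index of that user's open session)
    for user, when, page in parsed:
        previous = last_seen.get(user)
        if previous is None or when - previous[0] > timeout:
            sessions.append((user, [page]))  # idle too long: start a new visit
            index = len(sessions) - 1
        else:
            index = previous[1]
            sessions[index][1].append(page)
        last_seen[user] = (when, index)
    return sessions


for user, pages in sessionize(log):
    print(user, pages)
```

Once requests are grouped this way, access patterns such as common page sequences can be counted per session rather than per individual request.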
Many techniques can be developed to implement web usage mining, but it is important to know that the success of such applications depends on what and how much valid and reliable knowledge can be discovered from the log data. Most often, web logs are cleaned, condensed and transformed before any useful and significant information is extracted from them. Web mining can be performed on web log records to find association patterns, sequential patterns and trends in web access. The overall web usage mining process can be divided into three inter-dependent stages: data collection and pre-processing, pattern discovery, and pattern analysis [13]. In the data collection and pre-processing stage, the raw data is collected, cleaned and transformed into a set of user transactions that represent the activities of each user during visits to the website. In the pattern discovery stage, statistical, database and machine learning operations are performed to retrieve hidden patterns representing the typical behavior of users, as well as summary statistics on web resources, sessions and users.

3 Classification

3.1 What is Classification?

As the quantity and variety of available data increase, robust, efficient and versatile data categorization techniques are needed for exploration [16]. Classification is a method of assigning class labels to patterns. It is a data mining methodology used to predict group membership for data instances. For example, one may want to use classification to predict whether the weather on a specific day will be "sunny", "cloudy" or "rainy". The data mining techniques used to separate similar kinds of data objects or points from others are called clustering. Clustering uses the attribute values found in the data of one class to distinguish it from other types or classes. Data classification is mainly concerned with the treatment of large datasets.
In classification we build a model by analyzing the existing data, describing the characteristics of the various classes of data. We can then use this model to predict the class of new data. Classification is a supervised machine learning procedure in which individual items are placed in groups based on quantitative information about one or more characteristics of the items. Decision trees and Bayesian networks are examples of classification methods. In the broader sense used in Section 3.2, clustering can be seen as an unsupervised form of classification: it is the process of finding similar data objects or points within a given dataset. This similarity can be expressed in terms of distance measures or any other parameter, depending on the need and the given data.

Classification is an ancient term as well as a modern one, since the classification of animals, plants and other physical objects is still valid today. Classification is a way of thinking about things rather than a study of the things themselves, so it draws its theory and application from the complete range of human experience and thought [18]. In a broader picture, classification can cover medical patients grouped by disease, the set of images containing a red rose in an image database, the set of documents describing "classification" in a document/text database, equipment malfunctions grouped by cause, loan applicants grouped by their likelihood of payment, and so on. In the last case, for example, the problem is to predict a new applicant's loan eligibility given old data about customers. Many techniques are used for data categorization or classification; the most common are decision tree classifiers and Bayesian classifiers.

3.2 Types of Classification

There are two types of classification: supervised classification and unsupervised classification. Supervised learning is a machine learning technique for discovering a function from training data. The training data consists of pairs of input objects and their desired outputs.
The output of the function can be a continuous value, in which case the task is called regression, or a class label for the input object, in which case it is called classification. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples (i.e. pairs of input and target output). To achieve this goal, the learner needs to generalize from the presented data to unseen situations in a reasonable way.

Unsupervised learning is a class of machine learning problems in which one seeks to determine how the data is organized. It is distinguished from supervised learning in that the learner is given only unlabeled examples. Unsupervised learning is closely related to the problem of density estimation in statistics. However, it also covers many other techniques that are used to summarize and explain key features of the data. One form of unsupervised learning is clustering, which will be covered in the next chapter. Blind source separation based on Independent Component Analysis is another example. Among neural network models, adaptive resonance theory and self-organizing maps are the most commonly used unsupervised learning algorithms.

There are many techniques for implementing supervised classification. We will discuss the two most commonly used: decision tree classifiers and Naïve Bayesian classifiers.

3.2.1 Decision Trees Classifier

There are many alternative ways to represent classifiers. The decision tree is probably the most widely used approach for this purpose, and it is one of the most widely used supervised learning methods for data exploration. It is easy to use, can be represented as if-then-else rules, and works well on noisy data [16]. Decision trees use a tree-like graph or model of decisions and their possible consequences, including resource costs, chance event outcomes and utilities.
Decision trees are commonly used in decision analysis and operations research to help identify the strategy most likely to reach a goal. In machine learning and data mining, a decision tree is used as a predictive model, that is, a mapping from observations about an item to conclusions about its target value. More descriptive names for such tree models are classification tree or regression tree. In these tree structures, leaves represent classifications and branches represent the conjunctions of features that lead to those classifications. The machine learning technique for inducing a decision tree from data is called decision tree learning or, colloquially, decision trees.

Decision trees are a simple but powerful form of multiple variable analysis [15]. Classification is done by tree-like structures that apply a different test criterion for a variable at each node. New leaves are generated based on the results of the tests at the nodes. A decision tree is a supervised learning system in which classification rules are constructed from the tree. Decision trees are produced by algorithms that identify various ways of splitting a data set into branch-like segments. A decision tree tries to find a strong relationship between the input and target values within the dataset [15].

In classification tasks, decision trees visualize the steps that should be taken to reach a classification. Every decision tree starts with a root node, which is the ancestor of every other node. Each node in the tree evaluates an attribute in the data and decides which path to follow. Typically the decision test is a comparison of a value against some constant. Classification with a decision tree is done by traversing from the root node down to a leaf node.

Decision trees are able to represent and classify diverse types of data. The simplest form of data is numerical data, which is also the most familiar.
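The traversal from the root node down to a leaf can be sketched with a small tree structure. The example below encodes the driving-risk tree described in this essay (age first, then car type); the class names `Node` and `Leaf`, the branch labels and the handling of ages of exactly 20 or 30, which the text leaves open, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Leaf:
    label: str  # class label stored at a leaf


@dataclass
class Node:
    test: Callable[[dict], str]               # maps a record to the name of a branch
    branches: Dict[str, Union["Node", Leaf]]  # branch name -> subtree


def classify(tree, record):
    """Follow the tests from the root until a leaf is reached, then return its label."""
    node = tree
    while isinstance(node, Node):
        node = node.branches[node.test(record)]
    return node.label


def age_test(record):
    # Compare the age attribute against constants, as a typical decision test does.
    if record["age"] < 20:
        return "under 20"
    if record["age"] > 30:
        return "over 30"
    return "20 to 30"  # assumption: ages of exactly 20 and 30 fall in the middle branch


car_node = Node(
    test=lambda record: record["car_type"],
    branches={"sports": Leaf("high risk"), "family": Leaf("low risk")},
)
risk_tree = Node(
    test=age_test,
    branches={
        "under 20": Leaf("very high risk"),
        "20 to 30": car_node,
        "over 30": Leaf("very low risk"),
    },
)

print(classify(risk_tree, {"age": 25, "car_type": "sports"}))
```

Keeping the test as a function on each node lets the same `classify` loop handle both numeric comparisons (age against a constant) and nominal attributes (car type).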
Sunday, January 19, 2020
Monster: The Autobiography of an L.A. Gang Member :: gangster crips, kody scott, eight tray
On June 15th, 1975, the world of an eleven-year-old boy named Kody Scott changed completely. A month earlier, Kody had been suspended from school for flashing a gang sign during the school's panorama picture; from this it was evident where Kody was heading in life. Growing up in South Central Los Angeles, Kody was always surrounded by gangs and constantly witnessed the warfare between rival gangs. On returning home from his sixth grade graduation, Kody dashed out of his bedroom window and ran to meet Tray Ball, a member of the Eight Tray Gangster Crips who had agreed to sponsor Kody into the gang. That night Kody was beaten senseless by the members of the set as part of his initiation. Then Tray Ball approached Kody with a pump shotgun that contained eight shells and said: "Kody, you got eight shots, you don't come back to the car unless they are all gone." The gang drove north into enemy territory and eventually found and ambushed their target, a group of Bloods (the main enemy of the Crips). It was instant: gunshots rained from all directions, and Kody fired six times before chasing down an enemy Blood, whom he shot in the back. Kody's future in the gang was set in stone. He was accepted by all members immediately, especially Tray Ball, who encouraged him to pursue barbaric acts that made Kody's name soar in the streets. Two years later, at the age of thirteen, Kody was attempting a robbery and stomped on the victim for about twenty minutes, leaving the man in a coma. The police told bystanders that whoever had done such a thing was a monster; the name stuck to Kody and eventually became more prominent than his birth name. Needless to say, school was never Kody's main focus. Over the next two years, Kody made it his only ambition to fight for the gang and promote the superiority of the Eight Tray Gangsters.
Kody's ultimate goal was to achieve the status of "Ghetto Star", a title given to an individual who is known throughout the gang because of the barbaric acts they have committed in the name of their own set.
Saturday, January 11, 2020
Chinese philosophy Essay
I. Introduction

A. Thesis

There are many different philosophies and religions, and they greatly influence people's lives. In this paper, I am going to introduce and define representatives of Western philosophy, such as Plato's metaphysical Dualism, and of Chinese philosophy, such as Daoism. I am then going to compare these philosophies and explain the differences between them.

II. Dualism

A. Explain Plato's metaphysical Dualism

Plato's Dualism divides reality into two different realms of existence (the World of the Senses and the World of the Forms). One world (the physical world) is constantly changing, and the other (the world of the Forms) is unchanging. Plato further divided these two realms of existence. The world of Forms can be divided into the higher world (realm of the form) and the lower world (the Empirical world). The world of senses can be divided into physical objects (ordinary objects we perceive) and images (shadows, reflections and pictures).

B. Summary of Allegory of Cave

Plato explained his metaphysical dualism by using the Allegory of the Cave. According to Marc Cohen: In the allegory, Plato likens people untutored in the Theory of Forms to prisoners chained in a cave, unable to turn their heads. All they can see is the wall of the cave. Behind them burns a fire. Between the fire and the prisoners there is a parapet, along which puppeteers can walk. The puppeteers, who are behind the prisoners, hold up puppets that cast shadows on the wall of the cave. The prisoners are unable to see these puppets, the real objects that pass behind them. What the prisoners see and hear are shadows and echoes cast by objects that they do not see. However, one day one of the prisoners is released from what keeps him sitting, and he looks back. At that moment, he realizes that there are objects and a fire behind the prisoners and that someone is moving the objects. The shadows people have been seeing are fake.
People who are still sitting have never seen the objects behind them, so they believe that the shadows are real. The freed man can move, so he runs to the exit of the cave. After getting out of the cave, he feels dizzy because the world outside is too bright. After a while, his eyes get used to the brightness, and he sees a beautiful world: the mountains, the sky, the river, and the sun. He then realizes that the world outside the cave is real. He goes back to the cave and tells the people still sitting there what he saw outside. However, they do not believe that what he tells them is the truth.

C. Interpretation of the Allegory

Using this Allegory, we can think about today's world. There is too much information in the world, and the world seems smaller than before. In particular, the appearance of mass media such as newspapers, television, magazines, the Internet, and SNS has changed how we deal with information. Too much information is created and circulated every day, and we can get the information we need at any time through a device such as a smartphone, a PC, or a tablet. However, is the information we get really the reality? The information created by mass media might be like the shadows in the cave. Before I was born, there was already too much information. I learned much of it, such as Japanese history, in school, and I also learn about the incidents that occur every day through mass media. So I have learned most of what has happened around the world through information created by mass media, and that information is like a shadow in the Allegory of the Cave. Suppose that a civil war is happening in some country. We know about it through mass media. We might see people suffering on TV or in a photo. We feel as if we understand everything about the war through the pictures on TV or the words of the news, but that is not the whole thing, only a part of it.
In today's world, we need to seek reality with our own eyes.

III. Plato's Legacy

According to Philip Pecorino, "Plato thought that the soul could and would exist apart from the body and would exist after the death of the body. He offered a 'proof' for this position and was the first to do so in writing that we have any evidence of doing so. He offered several different proofs or arguments none of which are convincing today". His argument was that humans are composed of bodies and souls, but that the soul is more important and immortal. His arguments used premises that are questionable today. For example, Plato thought he could conclude that the soul could exist separately from the body because it worked independently of the body when it engaged in pure thought. But today it is widely accepted that how we think depends on how the physical brain works, so this is no longer accepted as true. Plato thought that people, when they learn, are remembering the knowledge implanted in their souls when the souls were in the realm of pure thought and eternal forms before entering the body, after which they forgot it as they became confused by physical emotions and feelings and by limited experiences through the senses. For him, that was the only way to explain how people come to know. This is no longer accepted as the best explanation of how people come to have knowledge. However, Plato is credited with being the first person to attempt to set out any sort of proof that humans have souls, that those souls survive the death of the body, and that they are immortal.

A. Descartes-Substance Dualism

According to Pecorino, "Descartes also believed that the soul existed prior to and separate from the body, and it was immortal. In his view, all of reality consisted of two very different substances: matter or the physical and spirit or the non-physical." The physical was what is extended in time and space, and the non-physical would not be so characterized.
He thought that his famous claim, "I think, therefore I am," established not just that he existed but that he existed without a body as a "thinking thing". A "thinking thing" is a thing that thinks, and thinking here includes imagining, conceiving, hoping, dreaming, desiring, fearing, conjecturing, reasoning, remembering and more. For him a "thinking thing" needed no physical parts to do what it does. Modern science has found no evidence of humans existing without a physical body and its brain. There is no evidence that thought is possible without a brain. There is much evidence that what has been associated with Descartes' "thinking thing" is now explained solely in terms of the brain: how it is physically structured and how it functions.

B. Aquinas

According to the text, "Saint Thomas Aquinas is the philosopher who explained five ways to demonstrate the existence of God within the framework of a posteriori (the knowledge comes from, or after, the experience) and developed cosmological and teleological arguments." I am going to explain one of the demonstrations: the way from the nature of efficient cause. In the world of sensible things, there is an order of efficient causes. It never happens that a thing is the efficient cause of itself. If you look at one phenomenon, you can see many efficient causes behind it. But you cannot go back to infinity. There must be a first efficient cause. Aquinas claims that this is God. Aquinas' claim is similar to Plato's. He thought God is the first efficient cause and an independent one. That is close to the concept of "the realm of the form" that Plato described. And the things in the world of sensible things are secondary to God, which is close to Plato's "Empirical world".

IV. Chinese Natural Cosmology

A.
Ames "Image of Reason in Chinese Culture"

Ames describes the difference between the dominant conceptions of reality in the West and in the Chinese tradition in his "Image of Reason in Chinese Culture". According to the text, Ames holds that to explore Chinese philosophy, you need to recognize at least that you are dealing with a fundamentally different world if you are familiar with Western culture. To bring into relief certain features of the dominant Indo-European view and the Chinese alternative to it, he contrasts a "logical" sense of order with an "aesthetic" order. What we call the "logical" sense of order has shaped Western philosophical and religious orthodoxy, and it is based on the presumption that there is something permanent, perfect, objective, and universal that disciplines the world of change and guarantees natural and moral order: some originative and determinative arche, an eternal realm of Platonic eidos or "ideas", the One True God of the Judeo-Christian universe, a transcendental strongbox of invariable principles or laws, an analytic method for discerning clear and distinct ideas. In a single-order world, the One God is the initial beginning of the universe. God is the primal and unchanging principle that causes and explains that origin and issues everything from itself, and this is a familiar presupposition in the Western tradition. Although the world is explained by "logical" order in the Western tradition, there is no "logical" order in Chinese philosophy. The order of the Chinese tradition is immanent in and inseparable from a spontaneously changing world. The universe possesses within itself its organizational principles and its own creative energy. In the view of the Chinese tradition, the world creates itself. That is scandalous from the viewpoint of Western scholarly reason. Yin and yang come together and guide the infinite combinations of these two opposite sources of energy.
These two sources of energy make a spontaneous intelligence possible. Yin and yang, as the characterization of a particular relationship, invariably entail a perception from some particular perspective that enables us to unravel patterns of relatedness and interpret our circumstances. They provide a vocabulary for sorting out the relationships among things as they come together and constitute themselves in unique compositions. Ames also mentions the Chinese word "li". In both the classical Chinese corpus and the modern language, it is the closest term that approximates "reason" or "principle". He claims that identifying the meaning of "li" correctly is essential to understanding Chinese philosophy. According to the text, "Philosophically, the most familiar uses of li lie somewhere in the cluster 'reasoning' or 'rationale' (A. S. Cua), 'principle' (W. T. Chan), 'organism' (J. Needham), and 'coherence' (W. Peterson)." These alternative translations of "li", although philosophically as protean as "principle" is for the Western tradition, unwarrantedly restrict li to a notion of human consciousness and tend to introduce distinctions such as animate and inanimate, agency and act, intelligible and sensible. Li is very different from some independent and immutable originative principle that disciplines a recalcitrant world. It is the fabric of order immanent in the dynamic process of experience. That is why "psychology" is translated into Chinese as "the li of the heart-and-mind," and "physics" as "the study of the li of things and events." What separates li rather clearly from the common Western understanding of "principle" is that li is both a unity and a multiplicity. Li is the coherence of any "member of a set, all members of a set, or the set as a whole." Both the uniqueness of each particular and the continuities that obtain among them are reflected by this description.
Li, then, is the defining character or ethos of a given community, or any other such composition. Ames also identifies another point at which li departs from "principle." In the Western tradition, the discovery of an originative and determinative principle gives us a schema for classifying things and subsuming one thing under another. That is why people seek "principle" in the Western tradition. The investigation of li, by contrast, seeks out patterns that relate things and discovers resonances between things that make correlation and categorization possible.

B. Hans-Georg Moeller

In the Daodejing, the meaning of "the root" is described through metaphor. From the Daoist point of view, our world is a "self-generating" process. In Daoism there is no initial beginning that establishes a "logical" order. Order is immanent in and inseparable from a spontaneously changing world, and so "the world creates itself." From this point of view, the role of "the root" is very important. "The root" is the origin of a phenomenon, and many things are derived from it. Compared with many Western philosophical perspectives, this "root" has a distinctive meaning. In Western philosophy, the principle or arche is the first cause of events, and nothing would exist without it. However, the concept of "the root" is different. From the Daoist perspective, "the root" is a part of the plant. "The root" does not exist before the plant, although the plant cannot exist without "the root." That is, "the root" itself is not the creator of the plant; it is the origin of the plant's growth. "The root" is buried in the soil, so it is invisible. However, it greatly influences the visible part of the plant.
This illustrates well the Daoist concept of "autopoiesis," or self-generation, which differs greatly from the Western philosophical concept of "arche," which is stated or recognized as "God."

V. Comparative Epistemology

A. Hellenistic-Prescriptive theoretical knowledge

In the Western tradition, most philosophers think there is one principle or one God from which things happen, and that the mind is separable from the body. One example is Plato. Plato's dualism holds that there are the realm of the form and the Empirical world. The body belongs to the Empirical world, which is constantly changing. What we sense with the body is limited, and the Empirical world is not real. The true world is the realm of the form, and the mind belongs to that world. Plato argues that "knowledge" exists continuously and must be justified conviction. However, the Empirical world that we belong to is continuously changing, and there is nothing unchanging in it. That is why there is nothing in the Empirical world from which we can get "knowledge", so we cannot get "knowledge" through our own senses. Unchanging things exist in the realm of the form, and we cannot reach that world by using our senses. So we need to use our mind to get "knowledge". Not all Western philosophers claim this, but most claim that the truth does not exist in the world where we live today. This concept greatly influenced Christianity and other religions that have one God. In Christianity, there is one God named "Jesus Christ", and he is the reason why things happen and why we live. People pray to seek "knowledge" that exists in a world where we are not living. That is, we cannot get that "knowledge" in the world where we are living, and we need to get it from the other world to know the essence of things.

B. Chinese philosophy-Prescriptive practical

On the other hand, there is no one God in Chinese philosophy.
In China, wars occurred constantly and dynasties changed over time, so people did not come to rely on one thing. This influenced Chinese philosophers. Instead of one god or one principle, Chinese philosophers think that the world creates itself, that the world is constituted by the combination of determinacy and indeterminacy, and that spontaneous, dynamic change is the universal principle of the world. In the Western tradition, philosophers try to attribute many phenomena to one cause. Chinese philosophers, however, think that each thing is "self-so": creative, self-generating, and spontaneous. For Chinese philosophy, Nature is very important, and in Daoism it is important not to try to force things. That is why Daoism has the concepts of wu wei (without intentional action), wu si (without deliberate thought), wu si (without selfish interest), wu ji (without self-awareness), wu zhi (without knowledge), and wu xin (without heart-and-mind). Daoists claim that while you are thinking about something, the world is changing at the same time, so you are missing something. That is why it is important in Daoism to stop thinking with your head, get out of the world inside your head, look around at the world, and take action. The most important thing in Daoism is that we ought to act as a part of the world.

VI. Conclusion

There have been many philosophers throughout history, and philosophies have developed all around the world. How people think about the world differs depending on the philosophy. Among the many philosophies, two that differ significantly are Western and Chinese philosophy. In Western philosophy, philosophers try to attribute everything to one principle or one God. In Chinese philosophy, on the other hand, there is no such principle; the philosophers have recognized the world as a self-generating process, and the world is the source of itself, without any exact start or end point.
This thought influences religion and how people think about the world. Around the world, many wars related to religion occur today. The differences between religions are just what ancient people developed, so it is important to try to understand those differences in today's world.

Reference

Pecorino, Philip, Ph.D. "Chapter 6: The Mind-Body Problem, Section 3: DUALISM." Introduction To Philosophy an Online Textbook. Queensborough Community College, CUNY, n.d. Web. 4 Dec 2013.
Deutsch, Eliot. Introduction to World Philosophies. 1st ed. 509. New Jersey: A Pearson Education Company, 1997. Ex-255-256. Print.
Deutsch, Eliot. Introduction to World Philosophies. 1st ed. 509. New Jersey: A Pearson Education Company, 1997. Ex-469. Print.
Cohen, Marc. "The Allegory of the Cave." Philosophy 320 History of Ancient Philosophy. University of Washington, 07 11 2013. Web. 4 Dec 2013.
Friday, January 3, 2020
The Commonly Confused Words Explicit and Implicit
In some contexts (as explained in the usage notes below), the words explicit and implicit are antonyms; that is, they have opposite meanings.

Definitions

The adjective explicit means direct, clearly expressed, readily observable, or laid out in full. The adverb form is explicitly.

The adjective implicit means implied, unstated, or expressed indirectly. The adverb form is implicitly.

Examples

I gave you an explicit order. I expect to be obeyed.
(James Carroll, Memorial Bridge. Houghton Mifflin, 1991)

Most states consider sexually explicit images of minors to be child pornography, meaning even teenagers who share nude selfies among themselves can, in theory at least, be hit with felony charges that can carry heavy prison sentences and require lifetime registration as a sex offender.
(Associated Press, "Teen Sexting Prompts Efforts to Update Child Porn Laws." The New York Times, March 17, 2016)

Love is one of those words that illustrate what happens to an old, overworked language. These days with movie stars and crooners and preachers and psychiatrists all pronouncing the word, it's come to mean nothing but a vague fondness for something. In this sense, I love the rain, this blackboard, these desks, you. It means nothing, you see, whereas once the word signified a quite explicit thing--a desire to share all you own and are with someone else.
(John Updike, "Tomorrow and Tomorrow and So Forth." The Early Stories: 1953-1975. Random House, 2003)

You must listen carefully and critically to understand Snoop's implicit message.

In academia, implicit bias, or implicit racial bias as it is here, refers to subtle forms of possibly unintentional prejudice affecting judgment and social behavior.
(Rose Hackman, "Black Judge Effect: Study of Overturning Rates Questions If Justice Is Really Blind." The Guardian [UK], March 17, 2016)

Usage Notes

These two words come from the same Latin root meaning to fold. When something is explicit, it's unfolded, laid open for people to see.
Implicit is the opposite of that. It means folded in, in the sense that its meaning is covered or contained within something else and isn't explicit. . . . An explicit statement makes a point distinctly, openly, and unambiguously. . . . An explicit picture, book, film, etc. depicts nudity or sexuality openly and graphically. . . . When something is implicit, it's implied, not plainly stated. . . . Implicit belief, implicit confidence, implicit faith, etc., involve having no doubts or reservations.
(Stephen Spector, May I Quote You on That?: A Guide to Grammar and Usage. Oxford University Press, 2015)

The words seem perfect antonyms--but for the unexpected fact that they join in implying that what they describe is undoubtable. Implicit trust is as firm as explicit trust because quite as real. Note that implicit makes its point absolutely but that implied requires telltale loose ends (see imply, infer). . . . Tacit is often used in the same way as implicit. A tacit reconciliation is one that both parties acknowledge and act upon without speaking of it.
(Wilson Follett, Modern American Usage: A Guide, rev. by Erik Wensberg. Hill and Wang, 1998)

Practice

(a) Though most people would agree that the media almost never deliver a message that explicitly encourages violence, some people argue that violence in the media carries the _____ message that violence is acceptable.
(Jonathan L. Freedman, Media Violence and Its Effect on Aggression, 2002)
(b) Cigarette packs carry _____ health warnings.

Answers to Practice Exercises

(a) Though most people would agree that the media almost never deliver a message that explicitly encourages violence, some people argue that violence in the media carries the implicit message that violence is acceptable.
(Jonathan L. Freedman, Media Violence and Its Effect on Aggression, 2002)
(b) Cigarette packs carry explicit health warnings.