Data Mining

Data mining is the discovery of hidden knowledge in massive amounts of corporate data using machine learning and artificial intelligence techniques. InfoTelica has developed unique custom data mining solutions for different business cases.
“Multidimensional Combinational Time Series Anomaly Analysis”, developed by InfoTelica, is a monitoring and surveillance system that covers all corporate time series data and detects existing anomalies on every possible chart.

Application Areas:

Business Analysis, Market Analysis, Customer Segmentation, Performance Prediction, Credit Approval, Target Marketing, Fraud Detection, Business Anomalies, Pattern Recognition, Spatial Data Analysis, Network Analysis, Text Mining

Major Data Mining Tasks:

Classification.

Clustering.

Association.

Visualization.

Summarization.

Deviation.

Estimation.

Link Analysis.

Data Mining Techniques:

Classification Model.

Anomaly Detection.

Regression.

Attribute Importance.

Clustering.

Association Rules.

Feature Extraction.

Classification Model:

Classification arranges the data into predefined groups by using machine learning techniques.

Data Classification Process:
1. Learning: Training data are analyzed by a classification algorithm. The learned model or classifier is represented in the form of classification rules.
2. Classification: Test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data.
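
The two steps above can be illustrated with a minimal Python sketch. It assumes a hypothetical one-attribute credit data set and learns a single threshold rule (a decision stump) instead of any of the algorithms listed in the next section; the data, the function names, and the 0.8 acceptance level are all illustrative assumptions.

```python
def learn_threshold_rule(training_data):
    """Learning step: pick the threshold on one numeric attribute that
    classifies the training data with the fewest errors.

    training_data is a list of (value, label) pairs with labels 0 or 1.
    The learned rule is: predict 1 if value >= threshold, else 0.
    """
    values = sorted({value for value, _ in training_data})
    best_threshold, best_errors = values[0], len(training_data) + 1
    for threshold in values:
        errors = sum(
            (1 if value >= threshold else 0) != label
            for value, label in training_data
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def classify(threshold, value):
    """Apply the learned rule to a single value."""
    return 1 if value >= threshold else 0


def accuracy(threshold, test_data):
    """Classification step: fraction of test records the rule gets right."""
    correct = sum(classify(threshold, v) == label for v, label in test_data)
    return correct / len(test_data)


# Hypothetical data: (monthly income in thousands, credit approved?)
training_data = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 1), (3.0, 1), (4.0, 1)]
test_data = [(1.2, 0), (2.2, 0), (2.8, 1), (3.5, 1)]

rule_threshold = learn_threshold_rule(training_data)
test_accuracy = accuracy(rule_threshold, test_data)

# Only apply the rule to new data if the estimated accuracy is acceptable.
new_applicant_label = None
if test_accuracy >= 0.8:  # assumed acceptance level
    new_applicant_label = classify(rule_threshold, 3.2)
```

Keeping the test data separate from the training data is what makes the accuracy estimate meaningful: the rule is judged on records it has not seen.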

Classification Algorithms:

Naive Bayes (NB)

Adaptive Bayes Network (ABN)

Support Vector Machine (SVM)

Decision Tree (DT)

Classification Application Areas:

Performance Prediction

Credit Approval

Target Marketing

Fraud Detection

Medical Diagnosis

Anomaly Detection:

Anomaly detection finds data instances that are unusual and do not fit any established pattern. On transactional data, anomaly detection concentrates on modeling normal behaviour in order to identify unusual transactions.
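
As a rough illustration of "modeling what is normal", the Python sketch here builds a profile (mean and standard deviation) from transactions assumed to be normal and flags new transactions that lie far from it. This is a simple z-score check, not a One-Class SVM; the amounts, the function names, and the limit of 3 standard deviations are assumptions made for the example.

```python
import statistics


def fit_normal_profile(amounts):
    """Model normal behaviour as the mean and sample standard deviation."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def is_anomalous(amount, profile, max_z=3.0):
    """Flag an amount whose distance from the mean exceeds max_z deviations."""
    mean, stdev = profile
    return abs(amount - mean) / stdev > max_z


# Hypothetical transaction amounts considered normal.
normal_amounts = [42.0, 38.5, 45.0, 40.0, 39.5, 44.0, 41.0, 43.5]
profile = fit_normal_profile(normal_amounts)

# New transactions to screen against the profile.
new_transactions = [41.5, 47.0, 250.0]
flags = [is_anomalous(amount, profile) for amount in new_transactions]
```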

Anomaly Detection Algorithms:

One-Class Support Vector Machine (SVM)

Anomaly Detection Application Areas:

Fraud detection.

Network Intrusion.

Business Anomalies and Outliers.

Regression:

Regression is a data mining (machine learning) technique used to fit an equation to a dataset. The simplest form of regression, linear regression, uses the formula of a straight line (y = mx + b) and determines the appropriate values for m and b to predict the value of y based upon a given value of x. Advanced techniques, such as multiple regression, allow the use of more than one input variable and allow for the fitting of more complex models, such as a quadratic equation.
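
For the straight-line case, the values of m and b can be found with ordinary least squares. The sketch is in Python and uses a small made-up data set; the function name and the numbers are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b


# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

m, b = fit_line(xs, ys)
predicted_y = m * 6.0 + b  # prediction for a new value x = 6
```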

Regression Algorithms:

Generalized Linear Models (GLM)

Support Vector Machine (SVM)

Regression Application Areas:

Fraud detection.

Network Intrusion.

Business Anomalies and Outliers.

Attribute Importance:

An attribute importance model identifies the relative importance of each attribute in predicting a given outcome and ranks the attributes accordingly. The attribute importance algorithm measures each attribute's univariate correlation to the target; that is, each attribute is considered as a one-predictor model of the target.
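
The idea of scoring each attribute on its own against the target can be sketched in Python as follows. The example ranks attributes by the absolute Pearson correlation with the target, which is a simplified stand-in and not the MDL-based algorithm; the attribute names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(attributes, target):
    """Score each attribute independently, then sort from most to least important."""
    scores = {
        name: abs(pearson(values, target))
        for name, values in attributes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical customer attributes and a numeric target (spend level).
attributes = {
    "income": [20, 35, 50, 65, 80, 95],
    "age": [25, 52, 31, 47, 38, 44],
    "shoe_size": [42, 39, 44, 40, 43, 41],
}
target = [1, 2, 3, 4, 5, 6]

ranking = rank_attributes(attributes, target)
```

Because every attribute is scored in isolation, this kind of ranking does not capture interactions between attributes.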

Attribute Importance Algorithms:

Minimum Description Length (MDL)

Attribute Importance Application Areas:

Importance of independent attributes.

Clustering:

Clustering is the process of grouping the data into classes or clusters, so that objects within a cluster have high similarity in comparison to one another but are very dissimilar to objects in other clusters. Clustering can also be used for outlier detection.
Four main approaches:

Partitioning: Divide the data into a given number of clusters

Hierarchical: Create a tree based on the similarity / distance of items

Density-based: Find contiguous areas with high density

Grid-based: Divide the data space into grid cells
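
The partitioning approach can be sketched with plain k-means in Python. This is the basic textbook algorithm under simplifying assumptions (the first k points serve as initial centroids, and a fixed number of iterations is run), not the enhanced variant named in the algorithm list; the points are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=20):
    """Partition points into k clusters; returns (centroids, assignments)."""
    centroids = [points[i] for i in range(k)]  # naive initialisation
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, assignments = k_means(points, k=2)
```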

Clustering Algorithms:

Enhanced k-means (KM)

Orthogonal Clustering (O-Cluster or OC)

Clustering Application Areas:

Customer segmentation.

Pattern recognition.

Spatial data analysis.

Association Rules:

Association rules show strong associations between attribute-value pairs (or items) that occur frequently in a given data set. Association rules are commonly used to analyze the purchasing patterns of customers in a store. The discovery of association rules is based on frequent itemset mining.
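
A small Python sketch of this idea: frequent itemsets are found level by level in the Apriori style, and rules are then kept only if their confidence is high enough. The basket data, the function names, and the support and confidence thresholds are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    size = 1
    candidates = [frozenset([item]) for item in items]
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        size += 1
        # Next level: keep a candidate only if all its subsets are frequent.
        level_items = sorted({item for c in level for item in c})
        candidates = [
            frozenset(combo)
            for combo in combinations(level_items, size)
            if all(frozenset(sub) in level for sub in combinations(combo, size - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Build rules (antecedent, consequent, support, confidence) with one consequent item."""
    found = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                found.append((tuple(sorted(antecedent)), item, support, confidence))
    return sorted(found)


# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]

frequent = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
```

In this toy data every basket containing butter also contains bread, so the rule from butter to bread has confidence 1.0, while the rule from butter to milk falls below the threshold and is dropped.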

Association Rules Algorithms:

Apriori (AP)

Association Rules Application Areas:

Market Analysis.

Cross Marketing.

Feature Extraction:

A feature extraction model derives a smaller set of new features from the original attributes, creating an optimized data set on which to base a model.

Feature Extraction Algorithms:

Non-Negative Matrix Factorization (NMF)
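
To give a feel for how NMF reduces a data set, the Python sketch here factorises a small non-negative matrix V into two smaller non-negative matrices W and H using the standard multiplicative update rules. The term-count matrix, the rank of 2, the iteration count, and the helper names are assumptions for illustration; production work would use a tested library implementation.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def reconstruction_error(V, W, H):
    """Squared Frobenius distance between V and the product W*H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, rank, iterations=200, seed=0):
    """Factorise V (n x m) into W (n x rank) and H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        # Update H element-wise: H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update W element-wise: W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H


# Hypothetical document-by-term count matrix (4 documents, 5 terms).
V = [
    [3, 2, 0, 0, 1],
    [4, 3, 0, 1, 0],
    [0, 0, 5, 4, 0],
    [0, 1, 4, 3, 0],
]

W0, H0 = nmf(V, rank=2, iterations=0)  # random starting point
W, H = nmf(V, rank=2)                  # after the updates
error_before = reconstruction_error(V, W0, H0)
error_after = reconstruction_error(V, W, H)
```

Each row of W describes a document using only two extracted features in place of five term counts, which is the reduced data set a later model would be built on.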

Feature Extraction Application Areas:

Text Mining