Nowadays, organizations develop technological systems to gather data that help management quickly build an understanding of their customers. This essay discusses data mining and its importance to an organization seeking competitive advantage in its marketplace.
Data mining is a process used to analyse data from different dimensions or angles and summarize it into constructive information. This information can be used to cut costs, increase revenue, or both. Companies generally use it to build strong relationships between internal elements such as product positioning, price, or staff skills and external elements such as customer demographics, competition, and economic indicators, in order to ensure customer satisfaction. It also helps companies determine the impact of organizational strategies on sales, corporate profits, and target customer groups.
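As a minimal illustration of this process, the sketch below summarizes hypothetical raw transaction records along one dimension (customer segment) into constructive information; all names and figures are invented:

```python
from collections import defaultdict

# Hypothetical raw transaction records: (customer_segment, product, amount).
transactions = [
    ("young-adult", "phone", 650.0),
    ("young-adult", "case", 25.0),
    ("senior", "tablet", 400.0),
    ("senior", "phone", 650.0),
    ("young-adult", "phone", 650.0),
]

# Summarize along the "customer segment" dimension:
# total revenue and transaction count per segment.
summary = defaultdict(lambda: {"revenue": 0.0, "count": 0})
for segment, _product, amount in transactions:
    summary[segment]["revenue"] += amount
    summary[segment]["count"] += 1

for segment, stats in summary.items():
    print(segment, stats["revenue"], stats["count"])
```

The raw records are the "data"; the per-segment totals are the "constructive information" management would act on.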
Benefits of Data Mining
The following are the benefits of data mining:
Predictive analytics to understand customer behaviour: Data mining tools help organizations forecast customer behaviour from recent trends. They enable management to find out what customers want and bring those wants to the attention of top management so that the quality of products or services can be improved. These tools thereby help management produce and promote need-based products effectively. With their help, management can apply a change strategy, producing products based on customer needs and thus attaching customers to new products for the long term.
Association discovery in products sold to customers: By mining data, this method finds buying patterns, enabling retailers of an organization's products to analyse customer behaviour at the time of purchase. With its help, organizations can save customers' time and increase their sales. For example, by analysing customers' past information and data, insurance providers can identify a customer's likely need for healthcare services before the customer takes out a policy for them.
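The buying-pattern idea above can be sketched as a tiny association-discovery example: counting how often items co-occur in baskets and computing the confidence of a rule. The baskets and items are invented for illustration:

```python
from itertools import combinations
from collections import Counter

# Hypothetical market baskets (one per customer visit).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
]

# Count support for single items and for item pairs.
item_counts = Counter()
pair_counts = Counter()
for b in baskets:
    for item in b:
        item_counts[item] += 1
    for pair in combinations(sorted(b), 2):
        pair_counts[pair] += 1

# Confidence of the rule bread -> milk: P(milk | bread),
# i.e. the fraction of bread baskets that also contain milk.
confidence = pair_counts[("bread", "milk")] / item_counts["bread"]
print(round(confidence, 2))  # 0.75
```

A retailer would read the rule as "customers who buy bread also buy milk 75% of the time", and could place or promote the products accordingly.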
On the basis of the above discussion, it can be concluded that data mining helps companies accomplish their business purposes effectively. Through data mining, companies formulate effective customer-oriented business strategies for gaining competitive advantages.
Credit card fraud has proven to be a big problem for many financial institutions, and banks are losing a lot of money to it. This paper discusses three models for detecting credit card fraud using different data mining techniques. Data mining is a way of generating patterns from data and grouping the data according to how it relates. Data mining techniques include clustering, decision trees, and neural networks. The core assignment is to evaluate different opinions on the same problem and extract knowledge from the applications that have been developed using data mining techniques.
Credit card fraud has turned out to be a major problem for the banks using this technology, an e-commerce technology for paying for goods and services without cash in hand. With the current systems for detecting credit card fraud, there is a problem of accessing bank databases, because most bank databases are huge and numerous (Ogwueleka 2011). Most of these systems fail to work with such databases, which hinders solving the problem. Some banks also fail to obtain updated fraud patterns frequently, and these banks may continuously suffer fraud attacks (Chaudhary et al. 2012). This failure also defeats the detection systems, because most of them use fraud patterns to decide whether a transaction is fraudulent or legitimate. If the database does not update the fraud patterns frequently, the system cannot work at its best, because these patterns change and therefore need frequent updating. There is also a chance that transactions made by fraudsters in the past fit the pattern of normal behaviour and are counted as legitimate transactions. Moreover, the profile of fraudulent behaviour changes constantly, so the system has to take this into account (Sahin et al. 2013).
2 Data Mining Techniques Used for Credit Card Fraud Detection
2.1 Clustering
Dheepa and Dhanapal (2009), Modi et al. (2013), and Delamaire et al. (2009) agree that clustering breaks data down in a way that generates patterns. Clustering allows identification of an account whose trend of transactions is changing, that is, one displaying behaviour it has never displayed before. Unlike other techniques, clustering does not require the model to know past fraudulent and non-fraudulent transactions (Ganji 2012); it is an unsupervised learning model. Ganji (2012) proposes a model using outlier detection and reverse k-nearest neighbours, which are both clustering techniques. With the outlier technique, an observation is flagged when it diverges so much from other observations that it raises suspicion. It does not learn from past transactions in the database; instead it detects changes in behaviour, or transactions that are unusual (Ganji 2012; Dobbie et al. 2010). The graphs below show how outlier detection classifies transactions.
Figure 1. Graphs showing how outliers work (Ganji 2012)
Figure 2. SODRNN algorithm pseudocode (Ganji 2012)
Outlier detection was combined with the algorithm above, reverse k-nearest neighbour, to develop Stream Outlier Detection based on Reverse k-Nearest Neighbours (SODRNN). This algorithm has two processes: the stream managing process and the query managing process. A window must also be allocated in memory. Incoming data stream objects are received by the former process, which efficiently brings the current window up to date. Upon the arrival of new stream objects, to maintain the current window it only needs to update the k-nearest-neighbour lists and reverse k-nearest-neighbour lists of the affected objects in the current window, rather than those of all data stream objects in the window. During the insertion of each new incoming object, the algorithm scans the current window to find the objects whose k-nearest neighbours are affected. When the k-nearest-neighbour lists of objects in the current window are updated, their reverse k-nearest-neighbour lists are updated as well. When a top-m query is issued, the latter process scans the current window and returns the m objects with the smallest reverse k-nearest-neighbour counts, i.e. the strongest outlier candidates.
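A batch (non-streaming) sketch of the idea behind SODRNN: a point that appears in few other points' k-nearest-neighbour lists (a small reverse k-NN count) is an outlier candidate. The 1-D transaction amounts and the value of k are illustrative assumptions, not the paper's setup:

```python
def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (1-D distance)."""
    dists = sorted(
        (abs(points[i] - points[j]), j) for j in range(len(points)) if j != i
    )
    return {j for _, j in dists[:k]}

def rknn_counts(points, k):
    """For each point, how many other points list it among their k-NN."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn(points, i, k):
            counts[j] += 1
    return counts

# Normal transaction amounts cluster near 50; 500 is the outlier.
amounts = [48.0, 50.0, 51.0, 52.0, 49.0, 500.0]
counts = rknn_counts(amounts, k=2)
outlier = amounts[counts.index(min(counts))]
print(outlier)  # 500.0
```

The streaming version keeps these lists incrementally inside a sliding window instead of recomputing them, which is what reduces the number of scans to one.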
Real datasets were used for the experiments. To evaluate detection performance, information about the outliers was assumed in all the experiments. SODRNN was implemented and run on a PC with the following specification:
- Pentium D 3.1 GHz
- 1 GB memory
- Windows XP
To carry out the experiment, a dataset of a chosen magnitude was generated: a random number generator produced uniformly distributed data in a high-dimensional space, comprising 10,000 multi-dimensional data points. The uniformly distributed data were used to test the X*X tree index structure and the memory it actually occupies. As Figure 3 shows, when the dimension increases, the memory space occupied by the X*X tree index structure increases correspondingly, because the MBRs of all nodes in the dataset take up more memory as the dimension grows, and because the X-tree additionally stores a split-history record for every node. The following figure shows the results.
Figure 3. Main memory requirements of the two index structures for different dimensions (Ganji 2012)
This model is good because there is no need to train on data, which is usually expensive for methods that require training. It is also able to detect previously undiscovered types of fraud, unlike supervised methods, and it reduces the number of scans to one compared with other models. The experiment that was carried out shows the model to be efficient and effective; it makes it easier to distinguish fraudulent from legitimate transactions by looking at the spending habits of the cardholder. Although clustering can be a very good technique, there are situations where it is likely to fail, such as when fraudulent transactions fall into the same pattern as legitimate ones. These cases are difficult to notice and difficult to solve: fraudsters can learn the pattern of legitimate transactions and make sure their transactions follow the same patterns, making it hard for the system to notice.
2.2 Neural Networks
Neural networks work like a human brain. The technique learns from past experience in order to predict and classify data (Akhilomen 2013; Ogwueleka 2011; Günther & Fritsch 2010; Chaudhary et al. 2012; Sherly 2012). Neural nets in credit card fraud detection do the same thing: they learn legitimate and fraudulent transactions, and after learning they are in a position to predict and classify transactions. Other methods used for credit card fraud detection have limitations: they lack the ability to learn from past experience and to predict the future from the present situation. Neural networks work in such a way that the linear combination of a node's inputs is computed, and if the weighted input sum exceeds the threshold, the activation fires.
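The threshold rule in the last sentence can be sketched as a single artificial neuron; the features and weights below are invented for illustration and are not taken from any of the cited systems:

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted input sum exceeds the threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

# Hypothetical features: [amount_zscore, unusual_location, unusual_hour],
# each weighted by how strongly it suggests fraud.
weights = [0.6, 0.3, 0.1]
print(neuron([0.2, 0.0, 1.0], weights, threshold=0.5))  # 0: looks legitimate
print(neuron([1.5, 1.0, 1.0], weights, threshold=0.5))  # 1: flags as fraud
```

A full network stacks many such nodes and learns the weights from labelled transactions instead of fixing them by hand.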
Ogwueleka (2011) proposed a system application that used a neural network method called Self-Organizing Maps (SOM) to separate fraudulent transactions from legitimate ones. The system used four categories: low, high, risky, and high risk. Only legitimate transactions are processed; those falling into the other groups are labelled suspicious or fraudulent and are not processed. Ogwueleka (2011) carried out the experiment following the steps below:
- Choose a suitable algorithm.
- Use the algorithm with a known dataset.
- Evaluate and refine the algorithm by testing it with other datasets.
- Discuss the results.
According to Ogwueleka (2011), when a transaction is made this application runs secretly in the background and checks whether the transaction is legitimate. The system has two subsystems: the database, where the transactions are read into the system, and the credit card fraud detection engine, which checks transactions for legitimacy as they are performed.
The detection system has two components, withdrawal and deposit. Each component has subcomponents: the database, the neural network classification, and visualization. The database was tested to make sure that all the needed data is brought into the model and used by it. The SOM algorithm was used for neural network classification: the dataset loaded from the database is divided into a training dataset and a test dataset. The training dataset was divided further into a subset used for building the model and a subset used to evaluate system performance.
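As a hedged illustration of the SOM technique named above, rather than Ogwueleka's implementation, here is a minimal sketch of one training step: each input pulls its best-matching unit toward itself (a full SOM also updates that unit's neighbours on the map). Units, features, and the learning rate are all invented:

```python
def best_matching_unit(units, x):
    """Index of the unit whose weight vector is closest to input x."""
    return min(range(len(units)),
               key=lambda i: sum((u - v) ** 2 for u, v in zip(units[i], x)))

def som_step(units, x, lr=0.5):
    """One SOM update: move the best-matching unit a fraction lr toward x."""
    i = best_matching_unit(units, x)
    units[i] = [u + lr * (v - u) for u, v in zip(units[i], x)]
    return i

# Two units; hypothetical (amount, hour) features scaled to [0, 1].
units = [[0.1, 0.1], [0.9, 0.9]]
som_step(units, [0.2, 0.0])   # pulls the "low" unit toward this input
som_step(units, [1.0, 0.8])   # pulls the "high" unit toward this input
print(units)
```

After many such steps the units settle into prototypes of transaction behaviour, and transactions mapping to an unusual unit can be flagged as suspicious.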
The test data was prepared and run on the system with the program under test. Results from the test were analysed against manually classified results to determine the effectiveness of the new model. Effectiveness was measured in terms of classification errors, which comprise the system detection rate and the false alarm rate. The dataset was built from the transactions made per day over a month in a Nigerian bank. The table below shows the performance results.
Figure 4. Performance results (Ogwueleka 2011)
The MATLAB software package was used to analyse the performance of the detection algorithms, and the results were compared with the collected data, as shown below.
Figure 5. Receiver Operating Characteristic (ROC) curve for withdrawal fraud detection (Ogwueleka 2011)
Figure 6. Receiver Operating Characteristic (ROC) curve for deposit fraud detection (Ogwueleka 2011)
PD = probability of detection
PF = probability of false alarm
When compared with other fraud detection models on the ROC curve, the credit card fraud detection watch has proven to be the best performer. The results also proved the reliability and accuracy of credit card fraud detection using a neural network. When testing the feasibility of neural network tools for the credit card fraud detection watch, two commercial products, quadratic discriminant analysis (QDA) and logistic regression (LOGIT), were used for comparison. Figure 7 shows the comparison of the performance of the credit card fraud detection watch model with QDA and LOGIT.
In Figure 7, the credit card fraud detection watch ROC curve shows detection of over 95% of fraud cases without causing false alarms. It is followed by the logistic regression ROC curve, which shows detection of 75% of fraud cases with no false alarms, while quadratic discriminant analysis detected only 60%. This proves that the credit card fraud detection watch performs better.
Figure 7. Comparison of the credit card fraud detection watch with other fraud detection systems' ROC curves for deposit fraud detection (Ogwueleka 2011)
The experimental results prove that this model is efficient and reliable. Its detection of 95% of fraud cases shows that it is suitable for addressing credit card fraud. The model uses four clusters, unlike other models, which use the two clusters that are normally used and perform badly. To raise the model's performance further, the author should consider using the back-propagation technique of neural networks, in which, when fraud is detected, the system sends the transaction back so the patterns can be updated. This would allow fast, reliable pattern updates and help the model deal with different feature types and detect errors in the large volume of transactions in a credit card system.
2.3 Decision Trees
A decision tree is a technique in which nodes are named after attributes, branches are given attribute values that fulfil certain conditions, and 'leaves' contain an intensity factor, defined as the ratio of the number of transactions that satisfy the leaf's condition(s) to the total number of legitimate transactions in the behaviour (Delamaire et al. 2009). A decision tree starts with a question that has two or more answers. Every answer points to further questions that help classify and identify the data so it can be grouped and a prediction can be made.
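The intensity factor defined above can be sketched directly; the transactions and the leaf condition below are hypothetical:

```python
# Hypothetical legitimate transactions: (amount, country).
legit = [
    (30.0, "UK"), (45.0, "UK"), (500.0, "UK"),
    (25.0, "FR"), (60.0, "UK"),
]

def intensity(condition, transactions):
    """Share of transactions that satisfy the leaf's condition(s)."""
    return sum(1 for t in transactions if condition(t)) / len(transactions)

def leaf(t):
    """Leaf condition: small domestic purchase (amount < 100, country UK)."""
    return t[0] < 100 and t[1] == "UK"

print(intensity(leaf, legit))  # 0.6
```

A new transaction reaching a leaf with a high intensity factor matches common legitimate behaviour; a leaf with a very low factor marks behaviour rarely seen among legitimate transactions.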
Sahin et al. (2013) proposed a fraud detection model called the cost-sensitive decision tree. Its main aim is to minimize misclassification cost and thus make the model highly accurate, which would recover large financial losses and increase customer loyalty. To test the approach, credit card data from a bank's credit card data warehouse was used. This data formed the training data used in modelling the technique and the test data used in testing the trained models. The sample data consisted of 978 fraudulent transactions and 22 million legitimate transactions; it was then reduced by stratified sampling of the legitimate transactions until the number reached 11,344,000, with 484 of the transactions being fraudulent (Sahin et al. 2013). The table below shows the distribution of the training and test data.
Figure 8 Distribution of the training and test data (Sahin et al. 2013)
In other decision trees, the splitting criterion can be cost-insensitive, and the class distribution or cost is fixed to a given ratio, so that classifying a fraudulent transaction as legitimate (a false negative) costs n times as much as classifying a legitimate transaction as fraudulent (a false positive). These misclassification costs are considered only when pruning takes place, not during the induction process. In the new approach, fraud is classified by how much it costs: the fraudulent transaction with the highest cost is detected first, followed by those with smaller costs. The performance of this model and the others is compared over test data using the saved loss rate (SLR), which gives the saved percentage of the potential financial loss.
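The saved loss rate (SLR) mentioned above can be sketched as the fraction of the potential financial loss that detection actually saved; the transaction amounts and detection outcomes below are invented:

```python
def saved_loss_rate(fraud_amounts, detected):
    """Fraction of the total potential loss saved by detected frauds."""
    total = sum(fraud_amounts)
    saved = sum(a for a, d in zip(fraud_amounts, detected) if d)
    return saved / total

# Four hypothetical fraudulent transactions; the model caught the two largest.
amounts = [900.0, 50.0, 600.0, 450.0]
caught = [True, False, True, False]
print(saved_loss_rate(amounts, caught))  # 0.75
```

This is why a cost-sensitive model can score well on SLR even with a modest true positive rate: catching the few expensive frauds saves most of the money.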
The best-performing model among those developed with the same method but different parameters is compared with the model developed with the proposed cost-sensitive decision tree algorithm. Six models developed using traditional decision tree algorithms were chosen and applied in the SPSS PASW modeller: C5.0, CART, CHAID, CHAID with a fixed cost ratio of 5:1, Exhaustive CHAID, and Exhaustive CHAID with a cost ratio of 5:1.
Figure 9. Statistical performance of ANN models (Sahin et al. 2013)
The table below shows the performance of all the chosen models.
Figure 10. Performance table (Sahin et al. 2013)
Figures 11 and 12 show the performance of the cost-sensitive tree models and the other models. The figures prove that the cost-sensitive decision tree models, excluding CS-Direct Cost, have saved more resources than the others. Banks are mostly concerned with the overall financial loss, or how to recover it, rather than with the number of fraudulent transactions detected; this new proposed method will help financial institutions recover money lost to fraud. The cost-sensitive models also work better than traditional classifiers in terms of the number of detected fraudulent transactions. Decision trees are a good choice because they prune themselves to remove data that reflects noise, preventing the tree from becoming large and complex with unimportant features. They also automatically create knowledge from data and can discover new knowledge. However, most implementations cannot deal with contradictory examples, and because a tree can become large and difficult to understand, developers find the technique hard to use, which is why it is not widely adopted.
Figure 11. Performance of models w.r.t. True Positive Rate (TPR) (Sahin et al. 2013)
Figure 12. Performance of models w.r.t. Saved Loss Rate (SLR) (Sahin et al. 2013)
3 Lessons Learnt
Data mining techniques have proven efficient and effective in the field of credit card fraud detection. With the techniques explored in this paper, the results of the experiments carried out prove beyond reasonable doubt that data mining techniques can be applied in this field and can help solve the problem of credit card fraud. The models developed using data mining techniques are capable of updating their patterns whenever a change occurs in the transactions database. They also detect fraud in real time, that is, at the moment the transaction takes place, unlike models that detect fraudulent transactions after they happen. These models are able to stop the transaction process when they judge it fraudulent, and 95% of the time they are correct.
The use of data mining techniques for detecting credit card fraud yields models with better performance. The models discussed in this paper are good enough to be used to stop credit card fraud, because many banks around the world are losing a lot of money to it, and it is every bank's wish to have an effective and efficient credit card fraud detection system (Raj & Portia 2011). The proposed models were tested with real data from banks' databases, which proves that, given a chance, they can reduce fraud, because they work with the same everyday transaction data of cardholders. Most of these models run in the background of the bank's system, so they introduce nothing new to the customer and do not interfere with purchasing and paying for goods and services; in short, customers will hardly know the system exists, because it is hidden from them. Some techniques, such as neural networks, are able to find fraudulent transactions and then update the database, which in turn updates the network's patterns.
References
Akhilomen, J., 2013. Data Mining Application for Cyber Credit-card Fraud Detection System. In Proceedings of the World Congress on Engineering 2013, London, UK, 3-5 July 2013.
Chaudhary, K., Mallick, B. & Noida, G., 2012. Credit Card Fraud: The Study of Its Impact and Detection Techniques. International Journal of Computer Science and Network, 1(4), pp. 2-6.
Delamaire, L., Abdou, H. & Pointon, J., 2009. Credit card fraud and detection techniques: a review. Banks and Bank Systems, 4(2), pp. 57-68.
Dheepa, V. & Dhanapal, R., 2009. Analysis of Credit Card Fraud Detection Methods. International Journal of Recent Trends in Engineering, 2(3), pp. 126-128.
Dobbie, G., Riddle, P. & Naeem, M.A., 2010. A Swarm Intelligence Based Clustering Approach for Outlier Detection. IEEE.
Ganji, V.R., 2012. Credit card fraud detection using anti-k nearest neighbor algorithm. International Journal on Computer Science and Engineering, 4(6), pp. 1035-1039.
Günther, F. & Fritsch, S., 2010. neuralnet: Training of Neural Networks. The R Journal, 2(1), pp. 30-38.
Modi, H., Lakhani, S., Patel, N. & Patel, V., 2013. Fraud Detection in Credit Card System Using Web Mining. International Journal of Innovative Research in Computer and Communication Engineering, 1(2), pp. 175-179.
Ogwueleka, F.N., 2011. Data Mining Application in Credit Card Fraud Detection System. Journal of Engineering Science and Technology, 6(3), pp. 311-322.
Raj, S.B.E., Portia, A.A. & Sg, A., 2011. Analysis on Credit Card Fraud Detection Methods. In International Conference on Computer, Communication and Electrical Technology, March 2011, pp. 152-156.
Sahin, Y., Bulkan, S. & Duman, E., 2013. A cost-sensitive decision tree approach for fraud detection. Expert Systems with Applications, 40(15), pp. 5916-5923.
Sherly, K.K., 2012. A Comparative Assessment of Supervised Data Mining Techniques for Fraud Prevention. International Journal of Science and Technology Research, 1, pp. 1-6.