
Stock Price Trend Forecasting Using Machine Learning Technique

Received Date: May 06, 2022 Accepted Date: June 15, 2022 Published Date: June 17, 2022

doi: 10.17303/jcssd.2022.2.301

Citation: Pitchaimanickam B, Jhansi V, Sri Lakshmi A, Yashoda K (2022) Stock Price Trend Forecasting Using Machine Learning Technique. J Comput Sci Software Dev 2: 1-8.

Predicting stock price trends by interpreting seemingly chaotic market data has long been an attractive topic for both investors and researchers. Among the methods that have been employed, machine learning techniques are especially popular because they can identify stock trends from large amounts of data that capture the underlying stock price dynamics. In this project, we applied supervised learning methods to stock price trend forecasting.

According to the efficient market hypothesis, the US stock market is a semi-strong efficient market, which means that all public information is reflected in a stock's current share price. This implies that neither fundamental nor technical analysis can be used to achieve superior gains in the short term (a day or a week). Indeed, our initial next-day prediction has very low accuracy, around 50%. However, when we tried to predict long-term stock price trends, our models achieved a high accuracy (79%). Based on our predicted results, we built a trading strategy on the stock, which significantly outperformed the stock's own performance.

Keywords: Stock Prediction, Machine Learning (ML), Regression

At present, the stock market fluctuates rapidly, and there are many complex financial indicators. However, advances in machine learning offer an opportunity to profit consistently from the stock market and can also help experts identify the most useful indicators to improve predictions [1]. The ability to forecast market value is essential for maximizing profits.

Financial models have been used by investment firms, hedge funds, and even individuals to better understand market behavior and make profitable investments and trades. Historical stock prices and corporate performance data provide a wealth of information that is well suited for machine learning algorithms to process. Even so, machine learning programs find it hard to predict future valuations [2].

There are simpler questions to answer that may also be profitable. Will tomorrow's closing price, for example, be higher than today's closing price?
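The question above can be turned into a binary target for a supervised model. The following is a minimal sketch, assuming the closing prices are available as a simple sequence; the function name `next_day_up_labels` and the sample prices are hypothetical and are not taken from the article's dataset.

```python
import numpy as np

def next_day_up_labels(close):
    """Return 1 where the next day's close is higher than the current day's close, else 0."""
    close = np.asarray(close, dtype=float)
    # Compare each close with the one that follows it; the last day has no label.
    return (close[1:] > close[:-1]).astype(int)

# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.5, 101.0, 102.2, 102.2]
labels = next_day_up_labels(closes)
print(labels)  # [1 0 1 0]
```

An unchanged close is labeled 0 here; that tie-breaking rule is an assumption and could reasonably be chosen differently.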

Machine learning depends on data. The approach involves identifying a profitable trade and then using the ML model-fitting process to search for patterns in the data [4].

By comparison, rule-based trading systems were developed earlier. Such a method involves computing some indicators and then waiting to see what happens. For nearly all of these systems, the result is a single decision tree. More complicated machine learning models generally outperform simple regression models in machine learning competitions and in trading [5].

System Architecture

The above flow chart shows how the system works, step by step. First, the dataset is uploaded. Next, the data is cleaned, that is, the unwanted data is removed, and then it is split into training and testing sets. In the training part, the model is trained on the data with a supervised learning technique. In the testing part, the model is tested and the final results are compared.

Previous Work

According to the author, this section explains the overall process of collecting the literature on stock market prediction (SMP) using machine learning. Initially, the phrase "stock market prediction using machine learning" was entered into various search engines, digital libraries, and databases, including Google Scholar, ResearchGate, ACM Digital Library, IEEE Xplore, Scopus, etc. [3]. During the literature collection, other phrases such as "stock market prediction methods", "impact of sentiments on stock market prediction", and "machine learning-based approach for stock market prediction" were also entered. The OR and AND operators were used for the keyword search in single and multiple categories, respectively. As a result, some of the primary papers in the field of stock market prediction were retrieved [7]. Through careful analysis of a few fundamental papers, basic knowledge of the domain was obtained. The search criteria were further adjusted to collect the literature of the last ten years, in order to broaden and improve coverage of the domain. In addition, the literature was screened by applying quality criteria, where metrics such as indexing, quartiles, impact factors, and publishers were noted. Figure 2 presents the steps followed in the literature collection.

SMP systems can be classified by the type of data they use as input. Most of the studies used market data for their analysis. Recent studies have also considered textual data from online sources. In this section, the studies are categorized by the type of data they use for prediction purposes [6].

Proposed System

a) Data discretization: After obtaining the datasets, we convert continuous attribute values into a finite set of intervals and associate each interval with a specific data value.

b) Data transformation: Using such methods, the dataset that is loaded into the project is converted into an object format for pre-processing.

c) Data cleaning: The transformed data is checked for errors. Redundant data is removed (standardization) from the given dataset, and attributes that are not required are then dropped.

d) Data integration: After preprocessing, the data is split into training and testing data for prediction, after which the result is obtained.
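The preprocessing steps above can be sketched with pandas. This is only an illustration under stated assumptions: the column names `Close` and `Volume`, the three-interval binning, and the chronological 75/25 split are hypothetical choices, not the article's actual configuration.

```python
import numpy as np
import pandas as pd

# b) Transformation: hypothetical raw data loaded into a DataFrame object.
raw = pd.DataFrame({
    "Close": [10.0, 12.5, 12.5, 15.0, 18.0, 21.0, np.nan, 25.0],
    "Volume": [100, 150, 150, 120, 300, 280, 90, 310],
})

# c) Cleaning: drop rows with missing values and exact duplicate rows.
clean = raw.dropna().drop_duplicates().reset_index(drop=True)

# a) Discretization: map the continuous closing price onto three equal-width intervals.
clean["CloseBin"] = pd.cut(clean["Close"], bins=3, labels=["low", "mid", "high"])

# d) Split: keep time order, the first 75% for training and the rest for testing.
split_at = int(len(clean) * 0.75)
train, test = clean.iloc[:split_at], clean.iloc[split_at:]
print(len(train), len(test))  # 4 2
```

A chronological split is used here so that the test rows come after the training rows in time; a shuffled split is the other common choice.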

The below-mentioned algorithms are used for stock prediction:

➔ Linear Regression.

➔ Random Forests.

➔ K Nearest Neighbor (KNN).

Linear Regression

Linear regression is an important tool for technical and quantitative analysis in financial markets because it analyzes two separate variables to determine a single relationship.

Traders can identify when a stock is overbought or oversold by plotting stock prices along a normal distribution (bell curve) [8].

A trader can use linear regression to find key price points such as entry, stop-loss, and exit prices. The system parameters for linear regression are determined by the price and time period of a stock, making the method generally applicable.

The above graph shows the data points and the regression line, with price and date as its parameters. Points that lie near the regression line are kept, and points that lie far from the line are removed.
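A regression line of price against time can be fitted by ordinary least squares. The sketch below uses `numpy.polyfit` on hypothetical prices; the sample values and the two-standard-deviation rule for flagging points far from the line are assumptions made for illustration, not the article's settings.

```python
import numpy as np

# Hypothetical closing prices on consecutive trading days (illustration only).
days = np.arange(6, dtype=float)
prices = np.array([10.0, 12.4, 13.8, 16.1, 18.2, 19.9])

# Fit price = slope * day + intercept by ordinary least squares.
slope, intercept = np.polyfit(days, prices, deg=1)

# Residuals measure how far each point lies from the regression line.
residuals = prices - (slope * days + intercept)

# Assumed rule: flag points more than two standard deviations from the line.
far_from_line = np.abs(residuals) > 2 * residuals.std()

# Extrapolate the fitted line one day ahead.
next_day_estimate = slope * 6 + intercept
print(slope, intercept, next_day_estimate)
```

Because the fit includes an intercept, the residuals sum to zero up to rounding error, which is a quick sanity check on the fitted line.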

Random Forest

Ensemble learning methods are used to create random forests. An ensemble simply refers to a group or a collection; in this case, a group of decision trees together is referred to as a random forest [9]. Ensemble models are generally more accurate than individual models because they combine the results of multiple models to reach the final decision. A process called bootstrap aggregating, or bagging, is used to select features at random. Different training subsets are formed from the dataset's collection of features by choosing random features with replacement [10]. This means that a single feature may appear in many training subsets at the same time.

For example, if the dataset contains 20 features and subsets of 5 features are to be chosen to build distinct decision trees, these 5 features are picked at random, and each feature can be part of more than one subset [11]. This ensures randomness, reducing the correlation between the trees and hence helping to prevent overfitting.
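The 20-feature example can be simulated directly. In this sketch the number of trees (8) and the random seed are hypothetical; features are distinct within one subset, but the same feature may be drawn for several subsets.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, subset_size, n_trees = 20, 5, 8  # n_trees is an assumed value

# Each tree gets its own random subset of 5 distinct feature indices.
subsets = [rng.choice(n_features, size=subset_size, replace=False)
           for _ in range(n_trees)]

# Count how many subsets each feature appears in.
counts = np.bincount(np.concatenate(subsets), minlength=n_features)
print(counts)
```

With 8 subsets of 5 features there are 40 draws over 20 features, so at least one feature necessarily appears in more than one subset.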

The trees are built using the best split after the features have been chosen. Each tree produces an output, which is regarded as a "vote" from that tree for that output. The random forest picks the final output that gets the most "votes" or, in the case of continuous variables, the average of all the outputs is taken as the final output.
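The two aggregation rules described above, majority voting for classes and averaging for continuous outputs, can be written in a few lines. The per-tree predictions below are made-up values, and the helper names are hypothetical.

```python
import numpy as np

def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    labels, counts = np.unique(tree_predictions, return_counts=True)
    return labels[np.argmax(counts)]

def average_prediction(tree_predictions):
    """Return the mean of the trees' continuous predictions."""
    return float(np.mean(tree_predictions))

# Hypothetical outputs from individual trees.
class_votes = np.array([1, 0, 1, 1, 0])                  # e.g. 1 = "up", 0 = "down"
price_estimates = np.array([101.0, 99.0, 100.0, 102.0])

print(majority_vote(class_votes))           # 1
print(average_prediction(price_estimates))  # 100.5
```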

K Nearest Neighbor (KNN)

K-Nearest Neighbour (K-NN) is one of the simplest ML algorithms and is based on the supervised learning technique [13]. It measures the similarity between the new case and the available cases and puts the new case into the category to which it is most similar. The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category by using the K-NN algorithm [12].
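The classification rule described above can be sketched without any library beyond NumPy. Euclidean distance, k = 3, and the toy "up"/"down" points are assumptions chosen for illustration; the function name `knn_predict` is hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    # Euclidean distance from the query to every stored training point.
    distances = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of the k nearest neighbours.
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical two-feature training data with trend labels.
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["down", "down", "down", "up", "up", "up"]

print(knn_predict(train_x, train_y, [5.1, 5.0]))  # up
```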


Random Forest algorithm

Random forest is a flexible, easy-to-use machine learning algorithm that produces a good result most of the time, even without hyper-parameter tuning. It is also one of the most widely used algorithms, owing to its simplicity and versatility (it can be used for both classification and regression tasks) [16]. In this section, we look at how the random forest algorithm works, how it differs from other algorithms, and how to use it.

The following code gives a brief idea of how we can use the algorithm in Python.

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset
data_set = pd.read_csv('user_data.csv')

# Extracting the independent variables (columns 2 and 3) and the dependent variable (column 4)
x = data_set.iloc[:, [2, 3]].values
y = data_set.iloc[:, 4].values

# Splitting the dataset into training and test sets (75% / 25%)
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
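The listing above stops after the train/test split. A possible continuation, shown here as a self-contained sketch, fits scikit-learn's `RandomForestClassifier` and scores it on the held-out data. Because `user_data.csv` is not available, the example substitutes synthetic two-feature data; the data, the label rule, and `n_estimators=100` are assumptions, not the article's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the article's dataset (assumption, illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(int)

# Same split settings as the listing above.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

# Fit the forest on the training set and evaluate on the test set.
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
classifier.fit(x_train, y_train)
accuracy = classifier.score(x_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

In scikit-learn's implementation, each tree is trained on a bootstrap sample of the rows and considers a random subset of features at each split.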

For Python, the scikit-learn (sklearn) library provides predefined implementations. We can import all of the algorithms from it and train them.

Project Results

Our project generates the following results.

Fig 3 shows the parameters used in the dataset. Open is the opening stock value of the day. High is the highest stock value of that day. Low is the lowest stock value of that day. Close is the closing stock value of that day.

Fig 4 is the graph of the closing stock price values for 1200 days. This graph shows how the closing price varies over a particular period.

Fig 5 is the graph comparing the coefficient of determination across the different algorithms. From this graph, we can say that linear regression is the best and gradient boosting is the worst algorithm.

Fig 6 is the graph comparing the mean squared error across the different algorithms. From this graph, we can say that linear regression is the best because it has the lowest mean squared error, and gradient boosting is the worst because it has the highest mean squared error.
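The two comparison metrics can be computed directly from their standard definitions: the mean squared error is the average of the squared differences between actual and predicted values, and the coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The values below are made up for illustration and are not the article's results; the helper names are hypothetical.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average squared difference between actual and predicted."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical actual and predicted closing prices.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

print(mse(y_true, y_pred))        # 0.125
print(r_squared(y_true, y_pred))  # 0.975
```

A lower mean squared error and a coefficient of determination closer to 1 both indicate a better fit, which is the sense in which the algorithms are ranked.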

In conclusion, stock price trend forecasting is used to estimate the direction of financial growth. For financial forecasting, regression is a promising method. Each approach, however, has its own set of advantages and disadvantages. The indicator function used can have a large effect on the prediction system's accuracy. Likewise, a particular machine learning algorithm may be better suited to a specific type of stock, while the same algorithm may forecast other types of stocks with lower accuracy [15].

Various supervised learning models have been used for the prediction, and we found that the SVM model gives the highest prediction accuracy (79%) when we predict the stock price trend on a long-term basis (44 days). Our feature selection analysis indicates that we get the highest accuracy when using all 16 features. That is because the number of data points is much greater than the number of features [14]. The trading strategy based on our prediction achieves very positive results by significantly outperforming the stock's own performance.

  1. Jagwani J, Gupta M, Sachdeva H and Singhal A (2018) Stock Price Forecasting Using Data from Yahoo Finance and Analysing Seasonal and Nonseasonal Trend. 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS) 462-467.
  2. Park JS, Sung Cho H, Sung Lee J, Chung KI, Kim JM and Kim DJ (2019) Forecasting Daily Stock Trends Using Random Forest Optimization. 2019 International Conference on Information and Communication Technology Convergence (ICTC) 1152-1155.
  3. Powell N, Foo SY and Weatherspoon M (2008) Supervised and Unsupervised Methods for Stock Trend Forecasting. 2008 40th Southeastern Symposium on System Theory (SSST) 203-205.
  4. Yao S, Luo L and Peng H (2018) High-Frequency Stock Trend Forecast Using LSTM Model. 2018 13th International Conference on Computer Science & Education (ICCSE) 1-4.
  5. Dinesh S and SR (2021) Prediction of Trends in Stock Market using Moving Averages and Machine Learning. 2021 6th International Conference for Convergence in Technology (I2CT) 1-5.
  6. Lumeng Chen (2021) Predicting Stock Prices Using Machine Learning Techniques. 2021 6th International Conference on Computation Technologies (ICICT) 1-5.
  7. Vazirani S, Sharma A and Sharma P (2020) Analysis of various machine learning algorithm and hybrid model for stock market prediction using python. 2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE) 203-207.
  8. Parmar I (2018) Stock Market Prediction Using Machine Learning. 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC) 574-576.
  9. Gareja Pradip, Chitrak Bari, Shiva Nandhini J (2018) Stock market prediction using machine learning. International Journal of Advance Research and Development 3: 10.
  10. Raza K (2017) Prediction of Stock Market performance by using machine learning techniques. 2017 International Conference on Innovations in Electrical Engineering and Computational Technologies (ICIEECT).
  11. Hiba Sadia K, Aditya Sharma, Adarrsh Paul, Sarmistha Padhi, Saurav Sanyal (2018) Stock Market Prediction Using Machine Learning Algorithms. International Journal of Engineering and Advanced Technology (IJEAT) 8: 4.
  12. Raut Sushrut Deepak, Shinde Isha Uday, Malathi D (2017) Machine Learning Approach in Stock Market Prediction. International Journal of Pure and Applied Mathematics 115: 8.
  13. Mehtab S, Sen J. A robust predictive model for stock price prediction using deep learning and natural language processing. In: Proceedings of the 7th International Conference on Business Analytics and Intelligence.
  14. Usmani M, Adil SH, Raza K and Ali SSA (2016) Stock market prediction using machine learning techniques. 2016 3rd International Conference on Computer and Information Sciences (ICCOINS).
  15. Tang J, Chen X. Stock market prediction based on historic prices and news titles. In: Proceedings of the International Conference on Machine Learning Technologies (ICMLT).
  16. Ashish Sharma, Dinesh Bhuriya, Upendra Singh (2017) Survey of Stock Market Prediction Using Machine Learning Approach. ICECA.