Stock Market Prediction

The stock market is known worldwide for being volatile, dynamic, and nonlinear. Stock price prediction is extremely challenging because many factors are involved, such as politics, global economic conditions, unexpected events, and a company’s financial performance.

But all of this also means that there is a lot of data to find patterns in. Financial analysts, researchers, and data scientists therefore keep exploring analytics techniques to detect stock market trends.

In this project, we built a model that predicts a company's stock price from its historical time series of stock prices. LSTM networks are well suited to finding patterns in time series data, so an LSTM model is used for this task.

The dataset consists of historical daily stock price values (open, high, low, close) and volume information for ABE.US, covering the period from ‘2005-2-25’ to ‘2017-11-10’.

Graph of the daily closing stock price used as the dataset

MAIN STEPS OF THE PROJECT:

1. Preprocessing:

All columns are checked for missing values.
Column data types are checked and converted to suitable types where needed (object to datetime).
Redundant columns are detected and removed from the dataset.
Scaling is applied to all dataset values except the predictor column to ensure normalization.
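
The preprocessing checks above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the project's actual code: the inline rows are made up, the `OpenInt` column is a hypothetical example of a redundant column, and the helper names `load_rows`, `missing_counts`, `constant_columns`, and `drop_columns` are assumptions. "Redundant" is taken here to mean "constant", which is only one possible criterion.

```python
import csv
import io
from datetime import datetime

# Made-up rows for illustration only; not taken from the real ABE.US data.
RAW = """Date,Open,High,Low,Close,Volume,OpenInt
2005-02-25,6.50,6.60,6.40,6.55,1000,0
2005-02-28,6.55,6.70,6.50,6.65,1200,0
2005-03-01,6.65,6.80,6.60,,900,0
"""


def load_rows(text):
    """Parse CSV text into dicts: dates become datetime, numbers become float,
    and empty fields become None so they can be counted as missing."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        row = {"Date": datetime.strptime(rec["Date"], "%Y-%m-%d")}
        for col, val in rec.items():
            if col != "Date":
                row[col] = float(val) if val != "" else None
        rows.append(row)
    return rows


def missing_counts(rows):
    """Number of missing (None) values per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}


def constant_columns(rows):
    """Columns whose value never changes carry no information."""
    return [col for col in rows[0] if len({r[col] for r in rows}) == 1]


def drop_columns(rows, cols):
    """Return copies of the rows without the given columns."""
    return [{k: v for k, v in r.items() if k not in cols} for r in rows]


rows = load_rows(RAW)
print(missing_counts(rows))
redundant = constant_columns(rows)
rows = drop_columns(rows, redundant)
print(redundant, list(rows[0]))
```

With a data-frame library the same checks are usually one-liners; the explicit version is shown only to make each step visible.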

2. Modeling:

After scaling, the normalized dataset is split into training and test sets. (0.8, 0.2)
Because the order and position of stock prices carry information that is useful for prediction, an LSTM model was used, which is well suited to sequential data.
The LSTM model is configured. The components decided in this configuration are:

- type of layers (LSTM and linear)

- number of layers (2 LSTM layers, 1 dense layer)

- dimension of layers (e.g., how many neurons each layer has (60->60->20->1))

- type of activation functions (the output neuron is left as a linear regression output, because it must give a continuous numeric value (the predicted price))
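
Before a sequence model can be trained, the scaled series has to be split and framed as input/target pairs. The sketch below shows one common way to do that, under two assumptions that the write-up does not state: the 0.8/0.2 split is chronological (no shuffling), and each sample is a fixed-length window of past values whose target is the next value. The function names and the window length are hypothetical.

```python
def chronological_split(series, train_frac=0.8):
    """Split a time-ordered sequence without shuffling: oldest part for training."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]


def make_windows(series, window):
    """Frame a series as (inputs, target) pairs: `window` past values -> next value."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Toy stand-in for the scaled closing prices (illustrative values only).
scaled_close = [i / 100 for i in range(100)]

train, test = chronological_split(scaled_close, train_frac=0.8)
X_train, y_train = make_windows(train, window=10)
X_test, y_test = make_windows(test, window=10)
print(len(train), len(test), len(X_train), len(X_test))
```

Splitting before windowing keeps every test window strictly later in time than the training data, so no future value leaks into training.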

Hyperparameters are determined. These are:

- Learning rate (0.001)

- Batch size (64)

- Epochs (1)

- Optimizer function (Adam Optimizer)

- Loss function (Mean Squared Error)
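
To show how these hyperparameters interact, here is a minimal sketch of mini-batch training with the Adam update rule on a mean squared error loss. To stay dependency-free it deliberately swaps the LSTM for a one-weight linear model `y = w*x + b`, so it illustrates the optimizer loop only, not the project's network. The toy data, the function names, and the default `beta1`, `beta2`, and `eps` values (the commonly used Adam defaults) are assumptions.

```python
import math
import random


def adam_fit(xs, ys, lr=0.001, batch_size=64, epochs=1,
             beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Fit y ~ w*x + b by mini-batch Adam on mean squared error."""
    rng = random.Random(seed)
    params = [0.0, 0.0]   # [w, b]
    m = [0.0, 0.0]        # first-moment (mean of gradients) estimates
    v = [0.0, 0.0]        # second-moment (mean of squared gradients) estimates
    step = 0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a random order each epoch
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            # Gradient of mean((w*x + b - y)^2) over the batch.
            errs = [params[0] * xs[i] + params[1] - ys[i] for i in batch]
            grads = [
                2 * sum(e * xs[i] for e, i in zip(errs, batch)) / len(batch),
                2 * sum(errs) / len(batch),
            ]
            step += 1
            for k in range(2):
                m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
                v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
                m_hat = m[k] / (1 - beta1 ** step)  # bias correction
                v_hat = v[k] / (1 - beta2 ** step)
                params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


def mse(xs, ys, w, b):
    """Mean squared error of the linear model on the full data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data on a known line (illustrative only).
xs = [i / 256 for i in range(256)]
ys = [2.0 * x + 0.5 for x in xs]

loss_before = mse(xs, ys, 0.0, 0.0)
w, b = adam_fit(xs, ys, lr=0.001, batch_size=64, epochs=1)
loss_after = mse(xs, ys, w, b)
print(loss_before, loss_after)
```

With 256 samples and a batch size of 64, one epoch performs only four parameter updates, which is why the number of epochs and the learning rate have to be read together.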

3. Training and Evaluation:

After the weights were updated in each epoch, the trained model was evaluated on the test dataset using the root mean squared error (RMSE) metric.
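
For reference, the root mean squared error is the square root of the average squared difference between actual and predicted values. A minimal sketch follows; the function name `rmse` and the sample numbers are illustrative assumptions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

The result is expressed in the same units as its inputs, so an RMSE computed on scaled values is in scaled units, not in price units.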

The raw predicted values are not directly meaningful, because the model is trained on a dataset with scaled values.

So the predicted values are inverse-scaled using the same fitted scaler. (robust)
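
The round trip from prices to scaled values and back can be sketched as follows. This assumes that "(robust)" refers to median and interquartile-range scaling in the style of scikit-learn's `RobustScaler`; the class below is a hypothetical, dependency-free stand-in for a single column, and the numbers are made up.

```python
import statistics


class SimpleRobustScaler:
    """Scale one column by its median and interquartile range (IQR):
    z = (x - median) / IQR, and back again with x = z * IQR + median."""

    def fit(self, values):
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        self.center = statistics.median(values)
        self.scale = (q3 - q1) or 1.0  # avoid division by zero on constant data
        return self

    def transform(self, values):
        return [(x - self.center) / self.scale for x in values]

    def inverse_transform(self, values):
        return [z * self.scale + self.center for z in values]


# Illustrative prices; the outlier (40) barely affects the median or the IQR.
prices = [10.0, 12.0, 11.0, 13.0, 40.0]

scaler = SimpleRobustScaler().fit(prices)
scaled = scaler.transform(prices)             # what the model would see
restored = scaler.inverse_transform(scaled)   # back in price units
print(scaled, restored)
```

The important detail is that the same fitted object is reused for the inverse step, since the median and IQR it stored are what map the model's outputs back to prices.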

Finally, we can see the real market prices alongside the predicted ones. The predicted and real values are highly correlated.
