I have parsed each of the 60,000 abstracts with C&C and Boxer, and have obtained some promising results. For the moment I am only using the parts of speech, named-entity tags, and words/lemmas from the C&C XML output. I prepare each abstract by combining the abstract information from the CSV file with the parsed XML, extracting each word's lemma, performing some currency operations, looking up SentiWordNet scores, and then serialising the resulting object. After that, I can quickly load the serialised objects and analyse them (analysing 10,000 takes approximately 2 minutes). I then write the features and annotations to a file for SVM Multiclass to learn from. Using 10,000 annotated abstracts, I split the data into 80% training and 20% test. I will soon implement 10-fold cross-validation (splitting the data ten times into 90%/10% train/test parts), which should give more robust results.
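The serialise-then-split workflow above can be sketched as follows. This is a minimal illustration, assuming the abstracts were pickled one object per file in an earlier pass; the function names and the fixed random seed are mine, not taken from the project's code.

```python
import pickle
import random

def load_abstracts(paths):
    """Load the serialised abstract objects (pickled in an earlier pass)."""
    abstracts = []
    for path in paths:
        with open(path, "rb") as f:
            abstracts.append(pickle.load(f))
    return abstracts

def ten_fold_splits(items, seed=42):
    """Yield (train, test) pairs for 10-fold cross-validation:
    each fold holds out a different 10% of the data for testing
    and trains on the remaining 90%."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so folds are reproducible
    fold_size = len(items) // 10
    for k in range(10):
        test = items[k * fold_size:(k + 1) * fold_size]
        train = items[:k * fold_size] + items[(k + 1) * fold_size:]
        yield train, test
```

Averaging the evaluation scores over the ten folds is what smooths out the luck of any single 80/20 split.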
Current results are:
per-class precision: 0.5774    overall precision: 0.5811
per-class recall:    0.5589    overall recall:    0.5811
per-class F-score:   0.5423    overall F-score:   0.5811
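The "overall" figures are identical across precision, recall and F-score, which is exactly what micro-averaging produces on single-label multiclass data: every false positive for one class is a false negative for another, so the micro scores all collapse to accuracy. A minimal sketch of the two averaging schemes (the function is illustrative, not taken from the evaluation script):

```python
from collections import Counter

def per_class_and_overall(gold, pred):
    """Macro ('per class') and micro ('overall') scores.
    Returns ((macro_prec, macro_rec, macro_f1), micro), where for
    single-label data the micro score equals accuracy."""
    labels = set(gold) | set(pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted p wrongly: false positive for p
            fn[g] += 1  # missed g: false negative for g
    precs, recs, f1s = [], [], []
    for label in labels:
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(f1)
    macro = (sum(precs) / len(labels),
             sum(recs) / len(labels),
             sum(f1s) / len(labels))
    micro = sum(tp.values()) / len(gold)  # == accuracy for single-label data
    return macro, micro
```

The macro scores weight every class equally, so the gap between the per-class F-score (0.5423) and the overall figure (0.5811) suggests the classifier does worse on the rarer classes.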
So with a per-class F-score of 54% there is still a lot of room for improvement before the results are substantial. More experimentation with the features is needed: at the moment I am using the top 1,000 unigrams, bigrams and trigrams, plus finance, economics and accounting gazetteers, for feature inclusion. More work is needed on the contextual features around these important terms.
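The n-gram and gazetteer features described above, together with a simple context window around gazetteer hits, might be extracted along these lines. This is only a sketch of the general technique; the feature names, window size and inputs are hypothetical, not the project's actual feature set.

```python
def extract_features(tokens, top_ngrams, gazetteer, window=2):
    """Fire an indicator feature for each top-ranked n-gram present in the
    abstract, plus gazetteer membership and the words in a small window
    around each gazetteer hit (the contextual features)."""
    features = set()
    # top unigram/bigram/trigram indicator features
    for n in (1, 2, 3):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            if gram in top_ngrams:
                features.add(f"ngram={gram}")
    # gazetteer hits and their surrounding context words
    for i, tok in enumerate(tokens):
        if tok in gazetteer:
            features.add(f"gaz={tok}")
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    features.add(f"ctx[{j - i}]={tokens[j]}")
    return features
```

To feed SVM Multiclass, each feature set would then be mapped to sparse `index:value` pairs with the class label in front, one line per abstract.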