Sustainability Analysis Project
Supervised Learning
This comprehensive analysis explores public discourse and scientific understanding of climate change through advanced text mining techniques. By analyzing diverse data sources, including social media discussions, scientific publications, and news articles, this project aims to uncover patterns in how different communities perceive and discuss climate-related issues.
I want to preface this by saying that I spent a lot of time experimenting with these three algorithms, trying to build a decent model, and in my opinion none of them is good enough at this point. The accuracies are abysmal. I think there are specific steps that could be taken to refactor this project and put it on a more successful trajectory. Also, if you'll kindly refer to the clustering page, it is evident that much of this data is very similar and very complex.
Due to a mixture of online natural language, sarcasm (among other rhetorical devices), and nuances in human speech, these results are not what I was hoping for. Somebody might have said: "The absence of findings is, in and of itself, a finding." Maybe I just said it; that doesn't make it any less true.
Project Components
Naive Bayes (Multinomial)
Naive Bayes is a family of probabilistic algorithms based on Bayes' theorem, used for classification tasks. The Multinomial Naive Bayes variant is particularly effective for text classification, where the features are the frequencies of words in the documents.
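To make the word-frequency idea concrete, here is a minimal sketch of Multinomial Naive Bayes in plain Python. It is not the project's code: the function names (`train_multinomial_nb`, `predict_multinomial_nb`), the labels, and the four toy documents are all made up for illustration, and Laplace smoothing with `alpha=1.0` is an assumed default.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log-priors and smoothed log-likelihoods from tokenized documents."""
    vocab = {word for doc in docs for word in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)  # per-class word frequencies

    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_counts.items():
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # Laplace smoothing: add alpha to every word count in the vocabulary
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model

def predict_multinomial_nb(model, doc):
    """Return the class with the highest log-posterior; unseen words are skipped."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        scores[label] = log_prior + sum(
            likelihood[word] for word in doc if word in model["vocab"]
        )
    return max(scores, key=scores.get)

# Toy, invented examples (not the data collected for this project)
docs = [
    "warming is real and urgent".split(),
    "emissions drive warming".split(),
    "climate hoax exaggerated".split(),
    "hoax claims exaggerated again".split(),
]
labels = ["concerned", "concerned", "skeptical", "skeptical"]

model = train_multinomial_nb(docs, labels)
prediction = predict_multinomial_nb(model, "warming emissions are urgent".split())
print(prediction)
```

Working in log space avoids multiplying many small probabilities together, and the smoothing term keeps a word that never appeared in a class from forcing that class's probability to zero.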
Decision Tree
A Decision Tree is a supervised learning algorithm used for classification and regression tasks. It splits the data into subsets based on the value of input features, creating a tree-like model of decisions.
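The splitting idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the project's implementation: it assumes Gini impurity as the split criterion, numeric features, and a hypothetical feature layout of [count of "warming", count of "hoax"] per document; the helper names and the six data rows are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure set, larger for mixed sets."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split until no split helps or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    feature, threshold = split
    left = [(row, y) for row, y in zip(rows, labels) if row[feature] <= threshold]
    right = [(row, y) for row, y in zip(rows, labels) if row[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(node, dict):
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

# Invented toy data: [count of "warming", count of "hoax"]
rows = [[2, 0], [1, 0], [3, 1], [0, 2], [0, 1], [1, 3]]
labels = ["concerned", "concerned", "concerned", "skeptical", "skeptical", "skeptical"]

tree = build_tree(rows, labels)
print(predict_tree(tree, [4, 0]), predict_tree(tree, [0, 4]))
```

The `max_depth` cap is one simple guard against overfitting; on noisy text data like social media posts, an uncapped tree tends to memorize the training set.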
Support Vector Machines
Support Vector Machines (SVM) are supervised learning models used for classification and regression analysis. They work by finding the hyperplane that separates data points of different classes with the widest possible margin, often in a high-dimensional feature space.
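As a rough illustration of the separating-hyperplane idea, the sketch below trains a linear SVM in plain Python. It swaps in a simple technique, per-sample subgradient descent on the regularized hinge loss, so it only approximates the optimal hyperplane rather than solving for it exactly. The function names, the hyperparameters (`lam`, `lr`, `epochs`), and the toy data are all assumptions made for this example, not values from the project.

```python
def decision_value(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(rows, labels, lam=0.01, lr=0.1, epochs=200):
    """Approximately minimize lam/2 * ||w||^2 + hinge loss; labels must be -1 or +1."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            if y * decision_value(w, b, x) < 1:
                # Inside the margin or misclassified: hinge term is active
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the regularizer shrinks w
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict_svm(w, b, x):
    """Classify by which side of the hyperplane x falls on."""
    return 1 if decision_value(w, b, x) >= 0 else -1

# Invented toy data: [count of "warming", count of "hoax"]; +1 = concerned, -1 = skeptical
rows = [[2, 0], [1, 0], [3, 1], [0, 2], [0, 1], [1, 3]]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(rows, labels)
print(predict_svm(w, b, [5, 0]), predict_svm(w, b, [0, 5]))
```

This toy set is linearly separable, which real text data usually is not; in that case the `lam` trade-off between margin width and training errors matters much more.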
Look at my code and the data I collected on my GitHub here:
GitHub Code