On Monday I presented the work we have been doing for over a year. We are using dynamic topic modeling and cross-domain correlations to understand how climate change research is influencing the Intergovernmental Panel on Climate Change (IPCC) assessments.
Dynamic Topic Modeling to Infer the Influence of Research Citations on IPCC Assessment Reports
This work is just getting started. It took us a year to get the IPCC documents and citations parsed and processed, build the climate change glossaries, work out the preprocessing steps that get the best out of the topic modeling, and tune the parameters. Now we are going to start large-scale experimentation.
Thanks to the best advisors ever, Dr. Finin and Dr. Halem, and thanks to Dr. Cane, who is a great scientist to work with. It is an honor to work with such great people.
Thursday, December 8, 2016
Tuesday, April 12, 2016
Friday, April 1, 2016
Image Thresholding in Python
I found this article to be very useful.
http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html
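For a quick feel of what binary thresholding does, here is a minimal sketch in plain Python. It is not taken from the linked article, which works with OpenCV (there the same rule is available as `cv2.threshold` with the `cv2.THRESH_BINARY` flag). The function name `binary_threshold`, the cutoff of 127 and the tiny `sample` image are all assumptions made up for illustration.

```python
# Binary thresholding on a tiny grayscale "image" stored as nested lists.
# Plain Python keeps the sketch dependency-free; the names and values
# below are hypothetical, chosen only to illustrate the rule.

def binary_threshold(image, thresh=127, max_value=255):
    """Return a copy of `image` where every pixel strictly above `thresh`
    becomes `max_value` and every other pixel becomes 0."""
    return [[max_value if pixel > thresh else 0 for pixel in row]
            for row in image]

# Hypothetical 3x4 grayscale image with intensities in 0..255.
sample = [
    [10, 120, 130, 250],
    [0, 127, 128, 255],
    [90, 200, 60, 30],
]

thresholded = binary_threshold(sample)
for row in thresholded:
    print(row)
```

Note that a pixel equal to the cutoff (127 here) goes to 0, because the comparison is strict.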
Distilling the Knowledge in a Neural Network
http://arxiv.org/abs/1503.02531
A very simple way to improve the
performance of almost any machine learning algorithm is to train many
different models on the same data and then to average their predictions.
Unfortunately, making predictions using a whole ensemble of models is
cumbersome and may be too computationally expensive to allow deployment
to a large number of users, especially if the individual models are
large neural nets. Caruana and his collaborators have shown that it is
possible to compress the knowledge in an ensemble into a single model
which is much easier to deploy and we develop this approach further
using a different compression technique. We achieve some surprising
results on MNIST and we show that we can significantly improve the
acoustic model of a heavily used commercial system by distilling the
knowledge in an ensemble of models into a single model. We also
introduce a new type of ensemble composed of one or more full models and
many specialist models which learn to distinguish fine-grained classes
that the full models confuse. Unlike a mixture of experts, these
specialist models can be trained rapidly and in parallel.
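To make the first idea in the abstract concrete, here is a minimal sketch of averaging the predictions of several models and then scoring a single model against that average. It is not the paper's code. Using the averaged probabilities as "soft targets" for the single model is my simplification of the compression idea, and all logits and function names below are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_predictions(per_model_probs):
    """Average class probabilities across the models of an ensemble."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models
            for c in range(n_classes)]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy of a prediction against (possibly soft) targets."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))

# Hypothetical logits from three ensemble members for one 3-class example.
ensemble_logits = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3], [2.2, 0.8, 0.2]]
ensemble_probs = [softmax(logits) for logits in ensemble_logits]
soft_targets = average_predictions(ensemble_probs)

# Hypothetical single "student" model. Its loss is measured against the
# ensemble average (assumption) rather than against a one-hot label.
student_probs = softmax([1.8, 1.0, 0.2])
loss = cross_entropy(soft_targets, student_probs)
print(soft_targets, loss)
```

Training the single model would mean adjusting its parameters to push this loss down, so that it reproduces the ensemble's averaged behavior at a fraction of the deployment cost.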
Data Exploration with Kaggle Scripts, Data Science, Data Exploratory Courses
This might be interesting at a surface level. I haven't evaluated it yet.
Data Exploration with Kaggle Scripts course.
Again more surface level stuff.
Intermediate Python for Data Science course.
This actually might have more substance; it is taught by a JHU professor.
Coursera course on Exploratory Data Analysis
Labels:
Courses,
Data Analysis,
data exploration,
Data Science,
kaggle,
Python
Thursday, March 31, 2016
A Survey of Graph Theory and Applications in Neo4J - Talk
This is a link to a talk given at a recent meet-up in Arlington, VA.
The talk starts out with pretty introductory material but as it progresses it gets more interesting. Definitely worth a read during a treadmill session.
Here is another relevant link.
My opinion of Neo4J, after using it for a year for experimental purposes, is that it is a decent application, but I highly doubt it scales to big data. I never tested this; it is a hunch based on my use.
Also, if you are using Neo4J to store triples: no, don't do that, it is way too much work. Just use a triple store.
Monday, March 28, 2016
CS231n: Convolutional Neural Networks for Visual Recognition Winter Course Project Report
There are lots of interesting reads on this page, and this is a great course to take if you are researching deep learning for image processing.
Tuesday, March 22, 2016
Sunday, March 20, 2016
3 Minute Thesis Competition - 3MT
Can you explain your dissertation in 3 minutes?
UMBC has a 3MT competition this Wednesday.
BALTIMORE, MD
If you are preparing for a 3MT, this is a good resource.
Other good 3MT videos:
2010 Trans-Tasman 3MT Winner - Balarka Banerjee from Three Minute Thesis (3MT®) on Vimeo.
Thursday, March 17, 2016
Markdown
I am starting to use Markdown more, and I wanted to know why I should care about it. This article makes a good case for using it and links to a tutorial.
Read it here.
Tuesday, March 15, 2016
Flask
Flask...
"Flask is a microframework for Python based on Werkzeug, Jinja 2 and good intentions."
I toyed around with the tutorial and was able to get a few simple apps running. I suppose if you are interested in building web sites, this might be interesting to try. I haven't determined if it is useful for anything else.
http://flask.pocoo.org/
Monday, March 14, 2016
Spring Break.....
I love working on campus during spring break. Front space parking, empty lab, no line for coffee.....ahhh nirvana....
No coffee! Bah!
Thursday, March 10, 2016
Ugh, dissertation
In those moments when you are frustrated with your dissertation, breathe, and know there are others feeling the same pain.....
The valley....
Ride the wave to finish this thing...
Tuesday, March 8, 2016
DL4J
I have been using Java for a long time, but I find DL4J a bit cumbersome to use. I prefer Torch/Lua or Theano for deep learning.
However, because Java has been such a significant part of my life for so long, I will not give up on DL4J.
More to come once I get this working.
In the meantime, here are a few links, so I can close those tabs:-)....
word2vec in DL4J
deep autoencoders in DL4J
nd4j
I like popcorn and I like bag of words
I think I am going to like this too.
It is aimed at beginners, so it may not be very useful, but they said popcorn, so they have my attention....
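For anyone who has not met bag of words before, here is a minimal sketch of the idea in plain Python. It is not taken from the tutorial. The tokenizer, the function names and the two example "reviews" are assumptions made up for illustration. Each document becomes a vector of word counts over a shared vocabulary, and word order is thrown away.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters (and apostrophes) as tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(documents):
    """Return (vocabulary, vectors): a sorted vocabulary and, for each
    document, its word counts in vocabulary order."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

# Hypothetical example documents.
reviews = [
    "I like popcorn and I like bag of words",
    "Popcorn, popcorn, popcorn!",
]
vocabulary, vectors = bag_of_words(reviews)
print(vocabulary)
for vector in vectors:
    print(vector)
```

The resulting count vectors are what a classifier would be trained on in a bag-of-words pipeline.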
Labels:
bag of words,
Data Science,
Tutorial,
word2vec
matplotlib
Examples using matplotlib.
Tutorial for matplotlib.
A little bit on density plots.
And an introduction to plotting in Python.
Labels:
density plots,
matplotlib,
plots,
Python,
Tutorial
Thursday, March 3, 2016
Deep Learning Resources
Great Papers:
http://www.iro.umontreal.ca/~bengioy/papers/ftml.pdf
http://deeplearning.net/reading-list/
Tutorials:
TensorFlow:
https://www.tensorflow.org/
Theano:
http://deeplearning.net/
http://deeplearning.stanford.
Important Names and associated tutorials/talks:
Hinton:
https://www.cs.toronto.edu/~hinton/nntut.html
LeCun:
http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf
Socher:
http://www.socher.org/index.php/DeepLearningTutorial/DeepLearningTutorial
Common Datasets:
IMAGENET - http://www.image-net.org/challenges/LSVRC/
Courses:
https://www.udacity.com/course/deep-learning--ud730 - Basic but uses TensorFlow, good to get a basic understanding
https://www.coursera.org/course/neuralnets - Provides great intuition, a little more challenging
https://cs231n.github.io/ - Great for understanding deep learning for images
Monday, January 25, 2016