So you see, research can often be frustrating if you expect results within a short timeframe. This is partly because data collection can take a while, but also because the world of research is not quite as efficient as your average tech company. Recently, I discovered that the work I’d been doing for the last 2.5 weeks had already been done a few years ago by another research group! That means I won’t be able to get a paper out of it, so I had to start over with something new. It’s not the hugest issue, though, because even if I wasn’t doing anything useful for the past 2.5 weeks, I was certainly getting practice with Python!
My task for the next four weeks is to read a bunch of papers on natural language processing! This is actually a lot of fun, because most of the papers are quite interesting. Something interesting I learned: I have now reached the stage in my education where I can read a paper called “Efficient Estimation of Word Representations in Vector Space” and be very excited and interested in its contents. Maybe this is why I don’t get invited to parties? Just kidding, the real reason is that I’m an engineering major and don’t have time to have fun. The only issue with this is that it means I will only have 9 weeks to do my actual project, but again, that’s just how research works sometimes, and hopefully I’ll have enough time to create something useful for the lab! I don’t say this because I’m altruistic and want to help out the lab; I say it because I want to be a co-author on a paper.
Natural language processing, or NLP, is the field of writing programs that analyze human language. When a text editor such as Google Docs predicts how you are going to finish your sentence, that is natural language processing. It typically works by using ‘word vectors’, which are long lists of real numbers that represent words. You can add and subtract them from each other to get interesting results! For example, in most well-designed models (of which there are a good number), if you take the vector for ‘king’, subtract the vector for ‘man’ and add the vector for ‘woman’, the resulting vector is SUPER DUPER CLOSE to the vector for ‘queen’. Closer than to any other word! This is super cool.
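Here’s a little sketch of that king − man + woman trick in Python. The vectors below are completely made up (real models like word2vec learn hundreds of dimensions from huge amounts of text), but they show the arithmetic and the “closest word” check using cosine similarity:

```python
import numpy as np

# Toy 4-dimensional word vectors -- totally made up for illustration.
# Real word vectors are learned from text and much longer.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "man":   np.array([0.1, 0.9, 0.1, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9, 0.1]),
    "queen": np.array([0.9, 0.0, 0.9, 0.2]),
    "apple": np.array([0.0, 0.1, 0.0, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means "pointing the same direction".
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman should land closest to queen.
target = vectors["king"] - vectors["man"] + vectors["woman"]
best = max(
    (w for w in vectors if w not in ("king", "man", "woman")),
    key=lambda w: cosine(target, vectors[w]),
)
print(best)  # -> queen
```

With a real trained model you’d do the same thing, just over a vocabulary of tens of thousands of words instead of five.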
There are two main methods for building word vectors: global matrix factorization, and local context window methods. Global matrix factorization is where you take an entire body of text, build a big matrix counting how often each word appears near each other word, and then factorize that matrix to get the vectors. Local context window methods are when you slide a ‘window’ of 5 or so words across a body of text, and train the model off of all the little 5-word snippets that you get. Some methods, such as GloVe, are hybrids of the two, but most are one or the other.
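To make the “window sliding across text” idea concrete, here’s a minimal sketch of how a local context window turns a sentence into training pairs. I’m assuming a window of 2 words on each side of the center word (real systems typically use a larger window and other tricks like subsampling):

```python
def context_pairs(tokens, window=2):
    """Slide a window over tokens, yielding (center, context) pairs."""
    pairs = []
    for i, center in enumerate(tokens):
        # Clamp the window to the edges of the sentence.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word isn't its own context
                pairs.append((center, tokens[j]))
    return pairs

text = "the quick brown fox jumps".split()
pairs = context_pairs(text)
print(pairs[:4])
# -> [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```

Each of those pairs becomes a tiny training signal: “these two words occur near each other,” which is all a model like word2vec really learns from.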
In addition, I’ve got my Raspberry Pi to communicate with every finger in my robotic hand! One issue I had been facing was being completely unaware of how servo motors are supposed to work. You see, I didn’t know that you had to screw the fingers into the motors. I didn’t even know that you could put screws inside of the servo motors. I just thought it was weird that the fingers kept popping off. Eventually I figured it out, but it took a while, which my housemate, a biomedical engineer (basically a mechanical engineer with neuroscience on top), found very amusing. The code is stored here, and it’s heavily based on this code which I found online. I also built an arm and a base for the hand, which my professor agreed was very necessary.

In non-tech-related news, ICE is now deporting international college students who are taking solely online classes. Hopefully UC Davis will come up with a way around this, but nothing is certain yet. I certainly hope so, or I might end up having to marry one of my international student friends to keep him from being deported!