We are really proud of our co-founder (联合创始人 lián hé chuàng shǐ rén), Sarah Chen, on the release of her new book, which is now publicly available. Congratulations!
Her book is available in various outlets:
Here is her pre-release announcement on LinkedIn in July this year.
Today in class we talked about money and how it is transacted.
Two acronyms we introduced are “DC” and “EP”, meaning Digital Currency and Electronic Payment.
An important piece of recent financial news that you might have missed is that the Chinese central bank (央行) has issued its own digital currency, saying that its design is similar to Facebook's proposed cryptocurrency Libra.
❗Digital currency is a big deal.
It not only affects how people go about their daily lives, but also world politics.
For example, we know that the United States sanctions certain countries by blocking them from the banking system.
❓ But what if these countries and the rest of the world start to use digital currency as the world currency instead of relying on the US dollar?
That would be really bad for the US, unless we become the leader in this space and catch up where we have fallen behind.
For the young, it is time to learn how DC and EP work.
In this weekend's Chinese lesson we talked about autumn 秋(qiū)天(tiān).
Fall is here. Fall is good.
Blue sky and white clouds.
The baby looks up and laughs.
Besides 中文, we are also working with other organizations to help children with computer programming 编程.
Today, we focused our class on decision trees. A decision tree is a way to organize data. You can look at it this way: you ask a series of questions, make a decision at each one, and organize the data based on those decisions.
For example, suppose our data consists of the colors and shapes of 3 pieces of fruit: 1 yellow apple, 1 red apple, and 1 yellow banana. We have two features: shape and color.
By organizing our data, we can identify types of fruit. We go through our features, shape and color, one by one. If we first organize the data by color, we will incorrectly group the yellow apple with the yellow banana. But if we first organize by shape, that immediately separates the apples from the banana. So we organize this data by shape first, and then by color (if we want to distinguish the yellow apple from the red apple).
The best way to organize the data will very likely be different for another dataset of fruits. But you get the point: we organize data to group things as cleanly as possible. At each step of the way, our data gets more organized; the "energy distribution" has lower entropy.
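The idea that a good split lowers entropy can be checked with a short sketch. This is our own illustration, not code from the class; the fruit tuples and the "bits" calculation are just a way to make the point concrete:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

# 1 yellow apple, 1 red apple, 1 yellow banana
fruits = [("yellow", "round", "Apple"),
          ("red", "round", "Apple"),
          ("yellow", "oblong", "Banana")]

print(entropy([label for _, _, label in fruits]))  # about 0.918 bits: mixed labels

# Organize by shape first: each branch now holds a single fruit type
round_branch = [label for _, shape, label in fruits if shape == "round"]
oblong_branch = [label for _, shape, label in fruits if shape == "oblong"]
print(entropy(round_branch), entropy(oblong_branch))  # 0.0 0.0: pure branches
```

Splitting by color instead would leave the yellow branch mixed (one apple, one banana), so its entropy would stay above zero; that is exactly why the tree prefers shape.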
That was a classification tree model.
When we train lots of decision trees, each on a different random part of a larger dataset, we get the so-called "random forest" 随机森林 model, originated by Leo Breiman.
We showed in class how to code a decision tree from scratch. Here is a shorter version using Python sklearn library.
>>> import numpy as np
>>> from sklearn import tree
>>> # making up data: Yellow = 1, Red = 2; round = 1, oblong = 0
>>> training_data = np.array([
...     [1, 1],
...     [2, 1],
...     [1, 0],
... ])
>>> data = np.array(['Apple', 'Apple', 'Banana'])
>>> data_names = ["color", "shape"]
>>> fruit_names = ['Apple', 'Banana']
>>> clf = tree.DecisionTreeClassifier()
>>> clf = clf.fit(training_data, data)
>>> tree.plot_tree(clf)  # visualize the tree
>>> import graphviz
>>> dot_data = tree.export_graphviz(clf, out_file=None)
>>> graph = graphviz.Source(dot_data)
>>> graph.render("fruit")
>>> dot_data = tree.export_graphviz(clf, out_file=None,
...     feature_names=data_names,
...     class_names=fruit_names,
...     filled=True, rounded=True,
...     special_characters=True)
>>> graph = graphviz.Source(dot_data)
>>> graph.render()
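For completeness, a random forest on the same toy fruit data can be sketched as below. This is not the class code, just a minimal illustration; the choices of `n_estimators=10` and `random_state=0` are ours:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Same toy encoding as above: Yellow = 1, Red = 2; round = 1, oblong = 0
training_data = np.array([[1, 1], [2, 1], [1, 0]])
labels = np.array(["Apple", "Apple", "Banana"])

# 10 trees, each fit on a bootstrap resample of the rows
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(training_data, labels)

# Predict the class of a yellow, oblong fruit
print(forest.predict(np.array([[1, 0]])))
```

With only three rows, individual bootstrapped trees can disagree; the forest's answer is the majority vote of its trees, which is the whole point of the model.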
“Nothing is lost, nothing is created, everything is transformed.”
― Antoine Lavoisier (26 August 1743 – 8 May 1794)
Unlike before, we started today's class with a quote. That is because entropy is really difficult to talk about, so we made many analogies (water flows from high to low; a broken mirror never, or almost never, becomes whole again) to draw our attention to how things work in daily life that we have taken for granted.
One theory holds that the universe started with the Big Bang, a state of very low entropy. There are many more states of high entropy than of low entropy (imagine 10…000 to 1). So the universe will have to cycle through lots of high-entropy states before its entropy is low again. We have only barely touched the topic, though; our real goal is the so-called decision tree model, which we will cover tomorrow.
To help you remember the word "entropy" and its meaning (as if we knew!): "en" comes from "energy", and "tropy" means "transformation", from the Greek tropē (a turning).
Entropy is a measure of the number of possible ways energy can be distributed in a system.
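The "counting the ways" idea can be played with in a few lines. As a stand-in for energy units, count coin flips: a macrostate is the number of heads, and its microstates are the distinct flip sequences that produce it. This toy example is ours, not from the class:

```python
from math import comb

# Macrostate: the number of heads in 10 coin flips.
# Microstates: the distinct flip sequences that produce that count.
n = 10
for k in range(n + 1):
    print(k, "heads:", comb(n, k), "sequences")

# The disordered middle (5 heads) has 252 sequences, while the
# perfectly ordered extreme (0 heads) has only 1: 252 to 1.
```

This is the same lopsided counting as the "10…000 to 1" above: there are vastly more ways to be disordered than ordered, which is why systems drift toward high entropy.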
By the way, Lavoisier 拉瓦锡 was a great chemist.
In today's class we deviated from our normal computer work and went further into one of the stories told by Terence Tao in the "Cosmic Distance Ladder" video. This particular story was about Tycho Brahe and Johannes Kepler.
It was a very important story of data and analysis.
Tycho Brahe and Johannes Kepler had totally disparate backgrounds and temperaments.
In spite of this, Tycho's painstaking and detailed observational data on the planet Mars, combined with Kepler's mathematical genius, allowed Kepler to derive the three laws of planetary motion. Both Tycho and Kepler made significant contributions to the change in the prevailing world view away from a geocentric universe. It was the beginning of a systematic study that transformed Medieval thinking: alchemy became chemistry and astrology led to astronomy.
A link for those students who can read Chinese:
After we discussed logarithms ("log") last week, we explored some commonly used methods that have log embedded in them: for example, the logit function, or logit transform (using the "natural" logarithm). We explain its definition with the following Python code.
>>> import numpy as np
>>> epsilon = 0.001  # small smoothing constant; 0.001 reproduces the printed values below
>>> def logit(c):
...     d = np.log((c + epsilon) / (1 + epsilon - c))
...     return d
The following is the inverse, which brings the transformed value back to what it was before.
>>> def inverse_logit(a):
...     b = ((1 + epsilon) * np.exp(a) - epsilon) / (np.exp(a) + 1)
...     return b
>>> print(logit(0.1)) # -2.1883847407670785
>>> print(inverse_logit(logit(0.1))) #0.09999999999999999
It is much more revealing to look at some pictures of what the logit transform is doing. See how fast the values get stretched out! Why are they stretched instead of shrunk? We know that taking a log is doing division repeatedly (recall that log10 of a number is how many times it must be divided by 10 to become 1). But applied to numbers between 0 and 1, this gives the opposite effect: a small positive number less than one has to be divided by 10 a negative number of times to become 1. For example, 0.01 needs to be divided by 10 negative two times to be restored to 1. That is why the y-axis has negative values.
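The "divide by 10 negative times" idea is easy to verify directly:

```python
from math import log10

# log10 of a number is how many times it must be divided by 10 to become 1
print(log10(100))   # 2.0: divide by 10 twice
print(log10(0.01))  # -2.0: divide by 10 "negative two" times
```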
On the other hand, we also have positive values on the y-axis. That is because for about half of the inputs, the odds (c+epsilon)/(1+ epsilon-c) are large positive numbers greater than one, so their log is positive. Play around with it and you will surely get it.
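To see where the sign flip happens, here is a minimal sketch using the plain odds p/(1 - p); the epsilon smoothing is dropped for clarity, which is our simplification:

```python
# Odds p / (1 - p) are below 1 for p < 0.5 and above 1 for p > 0.5,
# so log(odds) changes sign at p = 0.5
for p in [0.1, 0.5, 0.9]:
    print(p, p / (1 - p))
```

At p = 0.5 the odds are exactly 1 and the log is 0; below that the log is negative, above it positive, which is why the scatter plot straddles the x-axis.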
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(0, 1, 1000)
>>> y = logit(x)
>>> plt.scatter(x=x, y=y, alpha=0.3)
>>> plt.title("logit transform")
>>> plt.xlabel("numbers between 0 and 1 (inclusive)")
>>> plt.ylabel("after logit transform")
>>> plt.xlim(-6, 6)
>>> plt.ylim(-6, 6)
>>> plt.gca().set_aspect('equal', adjustable='box')
Look at the same plot with the axis scaled differently:
We learned the phrase 错题本 this past summer in China while attending two weeks of XDF summer classes. It means "wrong solution notes" or "error management book": a notebook for keeping track of where your mistakes are and making sure they do not happen again. It was used in all the classes we attended: math, Chinese, and English. The entries all follow a similar template: date, problem, error, why it happened, and the correct answer. Doing this systematically helps you focus on your weak areas and improves efficiency. Basically, you are customizing the learning for yourself. So-called artificial-intelligence-guided learning will not do better than this.
Dedicating a specific notebook to your mistakes is quite a good idea. It may seem unusual for those who like to move fast and break things, but it is definitely indispensable when we are building a skill.
Who says that an error management book, or notebook of mistakes, should only apply to school kids? Could we also expand the concept to life and work? Probably.
Let’s go get a 错题本 and get it started today!
The class has no homework today. We watched a video lecture by Terence Tao (see the link below) called "Cosmic Distance Ladder". Quite a mystifying name.
The stories Terence Tao told in the lecture were about philosophers and astronomers, from ancient times (Aristotle and others) up to those closer to us in history. What they all have in common is that they used careful observation and ingenious reasoning to indirectly measure the distance from the Earth to the Moon and the Sun, and the distances of the galaxies, with amazing accuracy (as verified by what we know today) and without any technology (the earliest did not even know the number Pi).
You should definitely watch the video a few times. Think about this: compared with human observation and reasoning, what computers can do is still just technology and tools. Computers cannot do the indirect reasoning that connects the dots across disparate information. It makes zero sense to believe computers (including phones) are smarter than you are.
So, use your great mind. Let your mind observe and reason, and make computers help you along the way.