Justin Gilmer traveled to New York to attend a friend's wedding. While on the East Coast, he visited his former adviser, Michael Saks, a mathematician at Rutgers University.
The two men didn't talk about math. Gilmer had not thought much about math since finishing at Rutgers, when he decided he didn't want a career in academia and began teaching himself to program. As they ate together, Gilmer told his old mentor about his job at Google.
It was sunny on the day Gilmer visited Rutgers. He remembered how he had once spent a year walking those same paths, thinking about a problem called the union-closed conjecture. For all his effort, he had succeeded only in teaching himself why the simple-seeming problem about sets of numbers was so hard to solve.
A lot of people, Gilmer said, think about the problem until they are satisfied that they understand why it is difficult. He said he spent more time on it than most.
After the visit, he got a new idea. Gilmer started to think about ways to use information theory to solve the union-closed conjecture. He pursued the idea for a month, expecting it to fail at every turn. Instead, the path to a proof kept opening up. On November 16 he posted a first-of-its-kind result that gets mathematicians much of the way toward proving the full conjecture.
The paper set off a wave of follow-up work. Mathematicians at the University of Oxford, the Massachusetts Institute of Technology, and the Institute for Advanced Study all built on Gilmer's methods. But first they asked a question of their own: Who is this guy?
The union-closed conjecture is about collections of numbers called sets, such as {1, 2} and {2, 3, 4}. One of the things you can do with sets is combine them, which is called taking their union. For example, the union of {1, 2} and {2, 3, 4} is {1, 2, 3, 4}.
A collection, or family, of sets is union-closed if the union of any two sets in the family is itself a set in the family. Consider this family of four sets:
{1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}
Combine any pair and you get a set that is already in the family, which makes the family union-closed.
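The definition above can be checked mechanically. The following is a minimal sketch in Python; the function name is_union_closed and both sample families are illustrative choices, not anything taken from Gilmer's paper.

```python
from itertools import combinations

def is_union_closed(family):
    """Return True if the union of every pair of sets in `family` is also in `family`."""
    members = {frozenset(s) for s in family}
    return all((a | b) in members for a, b in combinations(members, 2))

# Illustrative family (an assumption): a chain of nested sets.
nested = [{1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}]

# Illustrative non-example: the union {1} | {2} = {1, 2} is missing.
broken = [{1}, {2}]

print(is_union_closed(nested))  # True
print(is_union_closed(broken))  # False
```

Sets are converted to frozenset so they can be stored inside another set and looked up quickly.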
Péter Frankl, a Hungarian mathematician who emigrated to Japan in the 1980s, gave the first formal statement of the union-closed conjecture in 1979.
Frankl conjectured that if a family of sets is union-closed, it must have at least one element that appears in at least half the sets. That threshold was a natural one for two reasons.
First, there are readily available examples of union-closed families in which every element appears in exactly half of the sets. Take all the different sets you can make from the numbers 1 to 10: there are 1,024 of them, they form a union-closed family, and each of the 10 elements appears in 512 of them. Second, no one had ever come up with an example of a union-closed family for which Frankl's conjecture didn't hold.
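The counts for the sets made from the numbers 1 to 10 are easy to confirm by brute force. This is a small sketch; the helper name element_frequencies is hypothetical and introduced only for this illustration.

```python
from itertools import combinations

ELEMENTS = range(1, 11)  # the numbers 1 to 10

# Every subset of {1, ..., 10}, including the empty set.
power_set = [
    frozenset(c)
    for size in range(len(ELEMENTS) + 1)
    for c in combinations(ELEMENTS, size)
]

def element_frequencies(family, elements):
    """Map each element to the fraction of sets in `family` that contain it."""
    return {x: sum(1 for s in family if x in s) / len(family) for x in elements}

freqs = element_frequencies(power_set, ELEMENTS)

print(len(power_set))        # 1024
print(set(freqs.values()))   # {0.5}
```

Every element sits at exactly the 50% mark in this family, which is why the conjectured threshold cannot be raised above one half.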
It looked like 50% was the correct prediction.
That did not make it easy to prove. In the years since Frankl's paper, there have been few results. Prior to Gilmer's work, those papers established only thresholds that varied with the number of sets in the family, rather than a single threshold that holds for set families of all sizes.
The problem resembles many others that are easy, yet it has resisted attacks.
The lack of progress partly reflected the fact that many mathematicians preferred not to think about the problem. Gilmer remembers going to Saks's office and bringing up the union-closed conjecture. His adviser, who had wrestled with the problem himself, nearly threw him out of the room.
Gilmer recalled Saks telling him that the conversation would get him thinking about the problem again, and that he did not want to do that.
After visiting Rutgers, Gilmer tried to understand why the problem was so difficult. He started from a basic fact: if you have a family of 100 sets, there are 4,950 different ways to choose two and take their union. He asked himself how 4,950 unions could all map back onto just 100 sets unless some element appeared in those unions with at least some frequency.
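The figure of 4,950 is the number of unordered pairs that can be drawn from 100 sets. A short check, with variable names chosen only for illustration:

```python
from math import comb

n_sets = 100

# Number of ways to choose two distinct sets, ignoring order: 100 * 99 / 2.
n_pairs = comb(n_sets, 2)

print(n_pairs)            # 4950
print(n_pairs / n_sets)   # 49.5
```

In a union-closed family every one of those unions has to equal one of the 100 sets, so on average each set is the union of 49.5 pairs.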
He didn't know it at the time, but he was on his way to a proof. Information theory gives a rigorous way of thinking about what to expect when you pull a pair of objects at random.
Claude Shannon laid the foundations of the field in his 1948 paper "A Mathematical Theory of Communication." The paper gave a precise way of calculating the amount of information needed to send a message, based on the amount of uncertainty about what the message would say. This link between information and uncertainty was Shannon's fundamental insight.
Imagine I flip a coin five times and send you the resulting sequence. If it is a fair coin, transmitting the sequence takes five bits of information. If it is a loaded coin that almost always lands heads, it takes far less: we could agree in advance that I will send you a 1 (a single bit of information) if the coin lands heads all five times, which is very likely to happen. The outcome of a fair coin flip is more surprising than the outcome of a biased one, and so it carries more information.
The same thinking applies to sets of numbers. If I had a union-closed family, such as the 1,024 sets made from the numbers 1 to 10, I could pick two sets at random and communicate the elements of each set to you. The amount of information that message takes reflects the uncertainty about what those elements are: there is, for example, a 50% chance that the first set contains a 1, because 1 appears in half the sets in the family.
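Shannon's entropy formula makes the coin comparison concrete. The sketch below assumes a loaded coin that lands heads 99% of the time; that bias is an illustrative assumption, since the text does not give a number.

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy, in bits, of one flip that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -(p * log2(p) + (1 - p) * log2(1 - p))

FLIPS = 5
P_LOADED = 0.99  # assumed bias, for illustration only

# The flips are independent, so their entropies add up.
fair_bits = FLIPS * binary_entropy(0.5)
loaded_bits = FLIPS * binary_entropy(P_LOADED)

print(fair_bits)               # 5.0
print(round(loaded_bits, 2))   # 0.4
```

Under that assumption, five flips of the loaded coin carry well under half a bit on average, against five bits for the fair coin.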
Gilmer studied information theory when he was a graduate student in mathematics. Still, he worried that the way he thought to connect information theory to the union-closed conjecture was a naive idea from an amateur.
Gilmer said he was a little surprised that no one had thought of this before, though perhaps he should not have been: he had thought about the problem for a year himself.
Gilmer worked on the problem at night, after finishing his work at Google, and on weekends. He was encouraged by ideas that a group of mathematicians had explored years earlier in an open collaboration. He also kept a textbook at hand so he could look up formulas he had forgotten.
You would think, Gilmer said, that someone who comes up with a great result should not have to consult Chapter 2 of Elements of Information Theory, but he did.
Gilmer's strategy was to imagine a union-closed family in which no element appears in even 1% of the sets. If such a family existed, it would be a counterexample to Frankl's conjecture.
Suppose you choose two sets, A and B, from this family at random and consider the elements that could be in those sets, one at a time. What are the odds that set A contains the number 1? And set B? Since every element has less than a 1% chance of appearing in any given set, you would not expect either A or B to contain 1. So if you learn that neither in fact does, there is little surprise and little information gained.
Next, think about the chance that the union of A and B contains 1. It is still unlikely, but it is more likely than 1 appearing in either set alone: it is the chance that 1 appears in A, plus the chance that it appears in B, minus the chance that it appears in both. That comes to just under 2%.
That is still low, but it is closer to a 50-50 proposition, so sharing the result takes more information. In other words, if there is a union-closed family in which no element appears in at least 1% of the sets, there is more information in the union of two sets than in either of the sets themselves.
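The arithmetic for a single element can be sketched as follows. This assumes the element appears in exactly 1% of the sets and that A and B are chosen independently; both are simplifications made for illustration, and the comparison covers one element only, whereas the actual proof has to track the information across all elements together.

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy, in bits, of a yes/no outcome that is 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

p_single = 0.01  # assumed chance that a given set contains the element

# P(in A or B) = P(in A) + P(in B) - P(in both), with A and B independent.
p_union = p_single + p_single - p_single * p_single

print(round(p_union, 4))                                     # 0.0199
print(binary_entropy(p_union) > binary_entropy(p_single))    # True
```

Moving the probability from 1% toward 2% brings it closer to one half, where the entropy of a yes/no outcome is largest, so learning whether the element is in the union is more informative than learning whether it is in A.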
Revealing things element by element and looking at the amount of information you learn is very clever, and it is the main idea of the proof.
At this point Gilmer was closing in on Frankl's conjecture, because it is easy to show that in a union-closed family, the union of two sets necessarily contains less information than the sets themselves, not more.
To see why, think about the union-closed family of all the different sets you can make from the numbers 1 to 10. If you pick two of those sets at random, on average you will end up with sets containing five elements. Five is also the most common set size: 252 of the 1,024 sets have five elements. Their union is likely to contain around seven elements, and there are only 120 different ways to make a set with seven elements.
The point is that there is more uncertainty about the contents of two randomly chosen sets than there is about their union. The union skews toward larger sets, for which there are fewer possibilities. When you take the union of two sets in a union-closed family, you more or less know what you will get, as when you flip a biased coin, which means the union contains less information than the sets it is composed of.
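For the family of all sets made from the numbers 1 to 10, these quantities can be computed directly. The sketch below relies on a property specific to that family: when a set is drawn uniformly at random, each element is in or out independently with probability 1/2. The variable names are illustrative.

```python
from math import comb, log2

N = 10  # the numbers 1 to 10

def binary_entropy(p):
    """Shannon entropy, in bits, of a yes/no outcome that is 'yes' with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# Number of sets of each size k.
sizes = {k: comb(N, k) for k in range(N + 1)}
print(sizes[5], sizes[7])  # 252 120

# An element is missing from the union only if it is missing from both sets.
p_in_union = 1 - (1 / 2) ** 2          # 0.75
expected_union_size = N * p_in_union   # 7.5

bits_single_set = N * binary_entropy(0.5)    # 10.0 bits for one random set
bits_union = N * binary_entropy(p_in_union)  # about 8.11 bits for the union

print(expected_union_size)         # 7.5
print(bits_union < bits_single_set)  # True
```

Each element of the union behaves like a coin biased toward heads, so the union carries fewer bits than a single randomly chosen set.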
With that, Gilmer had his proof. If no element appears in even 1% of the sets, the union is forced to contain more information. But the union must contain less information. Therefore at least one element must appear in at least 1% of the sets.
When Gilmer posted his proof on November 16, he included a note saying he thought it was possible to use his method to get even closer to a proof of the full conjecture.
Three different groups of mathematicians then posted papers, within hours of each other, that built on Gilmer's work to do just that. The initial burst seems to have taken Gilmer's methods as far as they will go; getting to 50% will probably take additional new ideas.
The authors of the follow-up papers wondered why Gilmer had not simply made the improvement himself. The simplest explanation was that Gilmer didn't know how to do some of the technical analytic work needed to pull it off.
Gilmer said that he was rusty and stuck, and that he wanted to see where the community would take it.
Gilmer believes that the circumstances that left him out of practice made his proof possible.
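He said it was the only way he could explain why he thought about the problem for a year in graduate school and made no progress, then returned to it years later and made a breakthrough. He does not know how else to explain it, he said, other than that working in machine learning shaped his thinking.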