
Month: August 2015
#Summarization via Visualization and Graphs.
Originally shared by Deen Abiola
Also, there’s a typo (okay, at least one) in the Iran Agreement (well, in the version posted on medium and as of this writing…).
_G+ note: this is an inferior duplicate of the Medium version posted here: https://medium.com/@sir.deenicus/summarization-via-visualization-and-graphs-4b33454db3d6_
Ah, this is not part of the order of posting I planned, but…it’s not every day you get to analyze (and find a trivial mistake in) a government document. Since May, I’ve been writing a really fast, thread-safe, fully parallel NLP library because everything else I’ve tried is either too bloated, too slow to run or train, not thread-safe, too academic, too license-encumbered or uses too much memory.
More pertinently, I’ve also been on a life-long quest to figure out some way to effectively summarize documents. Unfortunately, technology is as yet too far away for my dream intelligent abstract summarizer; every single one of my apparently clever ideas has been unmasked as an impostor and pretender, always self-annihilating in exasperating puffs of failure. Sigh.
However, I have been able to combine ideas that work efficiently on today’s machines to arrive at a compromise (plenty more on that in the future). One key idea has been representing text at a layer above just strings: think Google’s word2vec, but requiring orders of magnitude less computation and data for good results (to be more specific, I use reflective random indexing and directional vectors, which go just a bit beyond bag of words).
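To make the idea of word vectors from random indexing a little more concrete, here is a minimal sketch. It is an assumption-laden illustration, not the library described in this post: it implements plain random indexing only (no reflective passes, no directional vectors), and the function names, the window size and the number of non-zero entries are all hypothetical choices.

```python
import math
import random


def index_vector(rng, dim, nonzeros):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def random_indexing(sentences, dim=200, nonzeros=8, window=2, seed=0):
    """Return {word: context vector} for tokenized sentences.

    Each word's context vector is the sum of the random index vectors
    of the words that appear within `window` positions of it.
    """
    rng = random.Random(seed)
    vocab = sorted({word for sent in sentences for word in sent})
    index = {word: index_vector(rng, dim, nonzeros) for word in vocab}
    context = {word: [0.0] * dim for word in vocab}
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = context[word]
                for d, value in enumerate(index[sent[j]]):
                    ctx[d] += value
    return context


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    denom = norm(a) * norm(b)
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0


if __name__ == "__main__":
    toy = [["iran", "enriches", "uranium"], ["iran", "stores", "uranium"]]
    vectors = random_indexing(toy)
    # Words that occur in the same contexts end up with similar vectors.
    print(cosine(vectors["enriches"], vectors["stores"]))
```

Nothing is trained here: one pass over the text with additions is enough, which is the main reason this family of methods is so much cheaper than word2vec.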
Once vectors have been generated (it took my machine 500 ms to do this) and sentences have been tagged with parts of speech, interesting possibilities open up. For example, the magnitude of a vector is an indication of how important a word is. It’s similar to word count but orders words in a way that better reflects a word’s importance (counts, once you remove common stopwords, are actually infuriatingly good at this already; infuriating because it can be hard to come up with something both better and less dumb). It can also work when few words are repeated, so it’s more flexible. Applying this to the Iran document, I get as the top 10 most important nouns:
> “iran, iaea, fuel, year, centrifuge, reactor, uranium, enrichment, research, joint”
And for verbs:
> “include, test, verify, modernise, permit, fabricate, redesign, monitor, intend, store”
This is useful, and being able to select a link, press a hotkey and get a small window displaying a similar result for any page will, I think, be a handy capability to have in one’s daily information processing toolkit. However, such a summary is limited. One idea is to take the top nouns and find their nearest neighbors, limited to verbs and adjectives. Here’s what I get:
> “iran: include/produce/keep is…future, subsequent, consistent
>
> year: keep/conduct/initiate is…more, future, consistent
>
> iaea: monitor/verify/permit is…necessary, regular, daily
>
> fuel: fabricate/intend/meet is…non-destructive, ready, international
>
> uranium: seek/enter/intend is…future, natural, initial
>
> reactor: modernise/redesign/support is…iranian, international, light
>
> centrifuge: occur/remain/continue is…single, small, same
>
> production: include/need/produce is…current, future, consistent
>
> use: include/produce/meeting is…subsequent, initial, destructive
>
> arak: modernise/redesign/support is…light, iranian, international
>
> research: modernise/redesign/support is…international, appropriate, light
>
> jcpoa: declare/implement/verify is…necessary, consistent, continuous”
Reading this, I see the results are almost interpretable. There’s the IAEA who will monitor Iran and JCPOA too, or something…I’m guessing. There’s lots of emphasis on Iran’s future and modernization, as well as limitations on uranium production and instruments—centrifuges in particular—in use (at this point, I’d like to point out that I’ve absolutely not even looked at the original document and don’t ever plan to). I don’t know if this method will ultimately prove useful; a lot of work involves experimenting with what actually works in day to day use. Some features are simply not worth the cognitive overhead of even just knowing they exist.
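The two steps used so far (rank words of one part of speech by vector magnitude, then look up each top noun’s nearest neighbors among verbs and adjectives only) can be sketched as below. The tiny 3-D vectors and the tags are made up purely for illustration; real vectors would come from the indexing step and a POS tagger, and the function names are hypothetical.

```python
import math


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def cosine(a, b):
    denom = norm(a) * norm(b)
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0


def top_by_magnitude(vectors, tags, pos, k=10):
    """Words with the given POS tag, largest vector magnitude first."""
    words = [w for w in vectors if tags[w] == pos]
    return sorted(words, key=lambda w: norm(vectors[w]), reverse=True)[:k]


def neighbours(word, vectors, tags, allowed_pos, k=3):
    """The k words most cosine-similar to `word`, restricted to allowed_pos."""
    scored = [
        (cosine(vectors[word], vectors[other]), other)
        for other in vectors
        if other != word and tags[other] in allowed_pos
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


# Hypothetical toy data, not taken from the Iran document.
VECTORS = {
    "iran": [3.0, 1.0, 0.0],
    "iaea": [0.0, 2.0, 2.0],
    "fuel": [1.0, 0.0, 1.0],
    "monitor": [0.0, 1.0, 1.0],
    "verify": [0.0, 1.0, 2.0],
    "produce": [2.0, 1.0, 0.0],
    "regular": [0.0, 2.0, 1.0],
    "future": [2.0, 0.0, 0.0],
}
TAGS = {
    "iran": "NOUN", "iaea": "NOUN", "fuel": "NOUN",
    "monitor": "VERB", "verify": "VERB", "produce": "VERB",
    "regular": "ADJ", "future": "ADJ",
}

if __name__ == "__main__":
    for noun in top_by_magnitude(VECTORS, TAGS, "NOUN", k=2):
        verbs = "/".join(neighbours(noun, VECTORS, TAGS, {"VERB"}, k=2))
        adjectives = ", ".join(neighbours(noun, VECTORS, TAGS, {"ADJ"}, k=1))
        print(f"{noun}: {verbs} is…{adjectives}")
```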
It was at this point I decided to graph the result. The basic idea is: connect all the words with edge weights computed from pairwise cosine similarities, but limit connections to be of the type VERB=>NOUN=>VERB, then compute a maximum spanning tree to prune the edges and make the graph actually readable. The idea is that, instead of just grouping words by similarity, we impose some grammatical structure and, hopefully, get something a bit more structured.
It was while browsing that graph I found the typo:

I’m fairly certain that “Chennals” is not some fancy Nuclear Engineering jargon.
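The graph construction described above (cosine-weighted edges, only between verbs and nouns, pruned with a maximum spanning tree) can be sketched as follows. This is one possible reading, assuming Kruskal’s algorithm with a small union-find; the post does not say which spanning tree algorithm was used, and the vectors and tags are invented toy data.

```python
import math


def cosine(a, b):
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0


def verb_noun_edges(vectors, tags):
    """Weighted edges between every verb and every noun (no other pairs)."""
    words = sorted(vectors)
    edges = []
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if {tags[u], tags[v]} == {"NOUN", "VERB"}:
                edges.append((cosine(vectors[u], vectors[v]), u, v))
    return edges


def maximum_spanning_tree(nodes, edges):
    """Kruskal's algorithm, taking the heaviest edges first."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    tree = []
    for weight, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # keep the edge only if it joins two components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical toy data, not taken from the Iran document.
VECTORS = {
    "iran": [3.0, 1.0, 0.0],
    "iaea": [0.0, 2.0, 2.0],
    "monitor": [0.0, 1.0, 1.0],
    "verify": [0.0, 1.0, 2.0],
    "produce": [2.0, 1.0, 0.0],
}
TAGS = {"iran": "NOUN", "iaea": "NOUN",
        "monitor": "VERB", "verify": "VERB", "produce": "VERB"}

if __name__ == "__main__":
    tree = maximum_spanning_tree(VECTORS, verb_noun_edges(VECTORS, TAGS))
    for u, v, weight in tree:
        print(f"{u} -- {v}  ({weight:.2f})")
```

Because only verb-noun edges are offered to the spanning tree, every path in the pruned graph alternates between verbs and nouns, which is what gives the VERB=>NOUN=>VERB shape.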

##Network Examples
I also built a graph using an algorithm that takes inputs from a phrase chunker and tries to build short, understandable phrases (verb-dominant phrases can only link to noun phrases), another on sentences and another on paragraphs. The gray-shaded and golden-edged nodes tend to be the most important and are worth zooming into. Around those will be all the most similar phrases/sentences/paragraphs.
##Click for: [Single Words Example](http://sir-deenicus.github.io/home/test_single_word_vbnouns.html)

Although this graph visualization was originally meant to compare and contrast (via orthogonal vectors) two or more documents, it works well enough as a summarization tool. In case you’re curious, the graph visualization toolkit I’m using is the excellent vis.js (I welcome any suggestions that’ll improve on the sometimes cluttered layout).
##Click for: [Phrases Example Network](http://sir-deenicus.github.io/home/test_phrases.html)
The Phrases example is clearly more comprehensible than the single word approach but is not without flaws: there are incomplete thoughts and redundancies. On the other hand, we see that similar phrases are grouped together. It’s worth noting that each phrase is represented by a single (200D) vector, hence the groupings are not based on string similarities. And, despite the algorithm not lowercasing all words, the method still groups differently cased words together, suggesting that it captures something more than “these words tend to be near each other.” It also groups conjugations and phrases in a non-trivial sense, as seen with higher level groupings like:
* produce fuel assemblies/fuel core reloads/fuel will be exhausted/spent fuel
* can be used/future use
Those are not just cherry-picked samples, as you can see for yourself in the link above. The method holds generally in all documents I’ve tried. Additionally, it’s worth remembering that nodes aren’t just grouped by similarity but must also meet the very basic noun phrase-ish => verb phrase-ish structure I mentioned. The goal is to get something sufficiently comprehensible while being non-linear and more exploratory. By zooming in and out and hiding irrelevant nodes, I can go into more or less depth as I please. This, together with basic question answering on arbitrary text, forms my very basic approximation of non-linear reading/knowledge acquisition. You can think of skimming as a far distant ancestor of this approach.
##Click for: [Sentences Example](http://sir-deenicus.github.io/home/test_sents.html)
##Click for: [Paragraphs Example](http://sir-deenicus.github.io/home/test_paras.html)
Zooming out is, I’ve found, important when dealing with longer text items (it removes clutter). Then you can click a node, which hides anything not in its neighborhood, making it easier to read when zoomed in. Other useful features are the ability to search for a word and the ability to hover over nodes to get at their text.

##Text Summaries
Similar to connecting verbs and nouns, I tried connecting augmented noun phrases (using a very, very simple rule on how to join phrases to maximize coherence) and did the same for verb phrases. With that, for the top 5 phrases, I got:
>“2. Iran will modernise the Arak heavy water research reactor to support peaceful nuclear research and radioisotopes production:
to be a multi-purpose research reactor comprising radio-isotope production/to support its peaceful nuclear research and production needs and purposes/to monitor Iran ’s production
>
>Iran ’s uranium isotope separation-related research and development or production activities will be exclusively based:
to any other future uranium conversion facility which Iran might decide to build/to verify the production/to minimise the production
>
>Iran ’s enrichment and enrichment R&D activities are:
to meet the enrichment and enrichment R&D requirements/conducting R&D/to enable future R&D activities
>
>Iran will maintain no more than 1044 IR-1 centrifuge machines:
will use no more than 348 IR-1 centrifuges/are only used to replace failed or damaged centrifuges/balancing these IR-1 centrifuges
>
>Iran will permit the IAEA to implement continuous monitoring:
will permit the IAEA to implement continuous monitoring/will permit the IAEA to verify the inventory/will allow the IAEA to monitor the quantities
This, I think, is actually a pretty decent summary. It’s far from perfect, but I’ve got a much better idea of what’s in the document despite the summary being fairly short. It’s also not a verbatim extractive summarizer, since it’s constructing and combining phrases, which incidentally also ends up compressing sentences. Although…if a proper generalizing summarizer were a human, this would be like the last common ancestor of humans and mice. Or maybe lice. Sigh.
Closer to more typical extractive methods is a very simple method I came up with that generates vectors for sentences using RRI (reflective random indexing). The method takes the largest-magnitude sentence and then finds the nearest sentence that gets within x% of its magnitude (I have x=50%). A sum of all met vectors is kept and a sentence must have > 0.7 similarity with this memory vector. This is repeated for all sentences. I’ve found that this method tends to create far more fluid summaries than is typical for extractive summarizers while working on almost all document types (even doing a fair job on complex papers and forum threads). For this Agreement, we get the below at 10% of the original document length:
———-
##More Fluid Extracted Summary:
“Destructive and non-destructive testing of this fuel including Post-Irradiation-Examination (PIE) will take place in one of the participating countries outside of Iran and that country will work with Iran to license the subsequent fuel fabricated in Iran for the use in the redesigned reactor under IAEA monitoring.
Iran will not produce or test natural uranium pellets, fuel pins or fuel assemblies, which are specifically designed for the support of the originally designed Arak reactor, designated by the IAEA as IR-40. Iran will store under IAEA continuous monitoring all existing natural uranium pellets and IR-40 fuel assemblies until the modernised Arak reactor becomes operational, at which point these natural uranium pellets and IR-40 fuel assemblies will be converted to UNH, or exchanged with an equivalent quantity of natural uranium.
Iran will continue testing of the IR-6 on single centrifuge machines and its intermediate cascades and will commence testing of up to 30 centrifuge machines from one and a half years before the end of year 10. Iran will proceed from single centrifuge machines and small cascades to intermediate cascades in a logical sequence.
Iran will commence, upon start of implementation of the JCPOA, testing of the IR- 8 on single centrifuge machines and its intermediate cascades and will commence the testing of up to 30 centrifuges machines from one and a half years before the end of year 10. Iran will proceed from single centrifuges to small cascades to intermediate cascades in a logical sequence.
In case of future supply of 19.75% enriched uranium oxide (U3O8) for TRR fuel plates fabrication, all scrap oxide and other forms not in plates that cannot be fabricated into TRR fuel plates, containing uranium enriched to between 5% and 20%, will be transferred, based on a commercial transaction, outside of Iran or diluted to an enrichment level of 3.67% or less within 6 months of its production.
Enriched uranium in fabricated fuel assemblies from other sources outside of Iran for use in Iran’s nuclear research and power reactors, including those which will be fabricated outside of Iran for the initial fuel load of the modernised Arak research reactor, which are certified by the fuel supplier and the appropriate Iranian authority to meet international standards, will not count against the 300 kg UF6 stockpile limit.
This Technical Working Group will also, within one year, work to develop objective technical criteria for assessing whether fabricated fuel and its intermediate products can be readily converted to UF6. Enriched uranium in fabricated fuel assemblies and its intermediate products manufactured in Iran and certified to meet international standards, including those for the modernised Arak research reactor, will not count against the 300 kg UF6 stockpile limit provided the Technical Working Group of the Joint Commission approves that such fuel assemblies and their intermediate products cannot be readily reconverted into UF6. This could for instance be achieved through impurities (e.g. burnable poisons or otherwise) contained in fuels or through the fuel being in a chemical form such that direct conversion back to UF6 would be technically difficult without dissolution and purification.
Iran will permit the IAEA to monitor, through agreed measures that will include containment and surveillance measures, for 25 years, that all uranium ore concentrate produced in Iran or obtained from any other source, is transferred to the uranium conversion facility (UCF) in Esfahan or to any other future uranium conversion facility which Iran might decide to build in Iran within this period.
If the absence of undeclared nuclear materials and activities or activities inconsistent with the JCPOA cannot be verified after the implementation of the alternative arrangements agreed by Iran and the IAEA, or if the two sides are unable to reach satisfactory arrangements to verify the absence of undeclared nuclear materials and activities or activities inconsistent with the JCPOA at the specified locations within 14 days of the IAEA’s original request for access, Iran, in consultation with the members of the Joint Commission, would resolve the IAEA’s concerns through necessary means agreed between Iran and the IAEA. ”
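For the curious, the sentence-selection procedure that produced the extract above can be sketched roughly as follows. The prose description leaves several details open, so this is only one possible reading: the seed is the largest-magnitude sentence, a candidate must reach 50% of the seed’s magnitude and have cosine similarity above 0.7 with the running memory vector, and the next pick is the candidate nearest to the previous pick. The `max_sentences` budget and all names are hypothetical, and the toy vectors stand in for real sentence vectors.

```python
import math


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def cosine(a, b):
    denom = norm(a) * norm(b)
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0


def extract_summary(sent_vectors, min_mag_ratio=0.5, min_memory_sim=0.7,
                    max_sentences=None):
    """Return indices of the selected sentences, in document order."""
    if not sent_vectors:
        return []
    mags = [norm(vec) for vec in sent_vectors]
    seed = max(range(len(sent_vectors)), key=mags.__getitem__)
    budget = max_sentences or len(sent_vectors)

    chosen = [seed]
    memory = list(sent_vectors[seed])  # running sum of selected vectors
    current = seed
    while len(chosen) < budget:
        candidates = [
            i for i in range(len(sent_vectors))
            if i not in chosen
            and mags[i] >= min_mag_ratio * mags[seed]
            and cosine(sent_vectors[i], memory) > min_memory_sim
        ]
        if not candidates:
            break
        # Follow the chain: take the candidate nearest to the previous pick.
        current = max(
            candidates,
            key=lambda i: cosine(sent_vectors[i], sent_vectors[current]),
        )
        chosen.append(current)
        memory = [m + x for m, x in zip(memory, sent_vectors[current])]
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical 3-D "sentence vectors"; index = position in the document.
    toy = [
        [4.0, 0.0, 0.0],  # largest magnitude, becomes the seed
        [3.0, 1.0, 0.0],  # close to the seed
        [0.0, 3.0, 0.0],  # big enough but off-topic
        [1.0, 0.0, 0.0],  # on-topic but too small
        [3.0, 2.0, 0.0],  # close to the accumulated memory
    ]
    print(extract_summary(toy))
```

Returning the indices in document order is a design choice on my part here: it keeps the extract reading in the same sequence as the source text.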


Say no to intrusive websites. Literally.
Originally shared by Emlyn O’Regan
I made a No button for the Web. It’s a chrome extension. If a site annoys you, hit “No”, choose your message, and that’ll be sent in a request to the site (so it can be seen in the logs). Then the tab closes.
Microsoft releases an iOS to Windows bridge.
This is actually a good idea. There are over a million apps in the iOS App Store, and almost none of them are discoverable.
Jeremy Nixon watched the debate so you don’t have to. Good job summarising.

Originally shared by Jeremy Nixon
Debate Notes
The Republican clown car is so large this year that we kicked off the Hunger Games earlier today with the junior varsity squad, the candidates who didn’t make the cut for tonight’s main event. Carly Fiorina, who ran HP into the ground and wants to do the same for the country, easily won the JV match.
The main event is headlined by absurd buffoon Donald Trump, raising the excitement level quite a bit. No one knows what will happen! May the odds be ever in your favor.
The tributes: Donald Trump; Marco Rubio; Jeb Bush; John Kasich; Ted Cruz; Mike Huckabee; Scott Walker; Chris Christie; Ben Carson; Rand Paul. Positioned on the stage by how they’re doing in the polls.
“You guys nervous?” Megyn Kelly asked. That was weird.
§
We kick things off with the Donald Trump question: Is there anyone on stage who is unwilling to pledge your support to the eventual nominee and to not run as an independent candidate? Donald Trump’s hand shoots up. The crowd boos. “I cannot say I have to respect the person, if it’s not me, the person that wins.”
Rand Paul attacks. “He’s already hedging his bet on the Clintons.”
Trump: “I will not make the pledge at this time.” More boos.
§
This opened as more of a shared interview than a debate. Rand Paul went on the attack a couple of times, but there were few exchanges between the candidates and few followup questions. The moderators instigated more exchanges later, but spontaneous ones were few.
Marco Rubio tried to convince us that he’s up to the job. He had lots of specifics on policy. He pointed out that the election shouldn’t be a “resumé competition”; if it is, he said, Hillary Clinton wins. (He avoided drinking from any water bottles for the entire debate. He kept it to the commercial breaks, I guess.)
Ted Cruz tried to convince us he’s the straight-talking truth-teller, and that’s what we need. He did this pretty well, disturbingly.
Jeb Bush tried to convince us he’s neither his brother nor his father. “I’m gonna have to earn this,” he said. He defended himself on education very well, in spite of his support for Common Core. He tried to hedge his support for a path to citizenship for illegal immigrants with being tough on illegal immigration.
Scott Walker tried to convince us he hates abortion more than anyone else. “I defunded Planned Parenthood four years ago.” Reminded that 83% of Americans favor an exception to protect the life of the mother, and asked how he could justify being against that exception, he said, “I believe that is an unborn child.”
Rand Paul tried to convince us he’s “a different kind of Republican.” Which he is, but we already knew that, so I’m not sure what the point was.
John Kasich tried to remind us that he exists. He talked about his record a lot, as governor of Ohio, to remind us that he’s governor of Ohio. He told us more than once that his father was a mailman. He actually did very well, getting into specifics on his record.
Mike Huckabee tried to convince us he’s still relevant. But, asked how he would convince enough independents and Democrats to vote for him, despite being so right-wing on social issues, he said the president should invoke the 5th and 14th Amendments to stop abortion. Which is pretty much the opposite of what he was asked.
Ben Carson tried to convince us he belongs on that stage with everyone else. I don’t think he did any such thing. He seemed out of his depth. Asked a specific question, whether he would bring back waterboarding, his answer was that he wouldn’t tell our enemies what we’ll do to get information. “There is no such thing as politically correct war.”
Chris Christie tried to convince us he’s a Republican. He had some strong speeches about 9/11 and reminding us that he was a federal prosecutor putting terrorists in prison, but he didn’t have his breakout moment tonight. He got into a couple of squabbles with Rand Paul over civil liberties, and came down strongly on the side of the NSA doing whatever the heck they want.
And then there was Donald Trump. He kind of owned the night. But Fox’s focus group had a really negative response. “I expected him to rise to the occasion and look presidential, but he didn’t.” And they were not impressed with his refusal to pledge not to run as an independent.
§
Megyn Kelly asked Mr. Trump about speaking his mind “without a politician’s filter,” pointing out that he has called women “fat pigs”, “dogs”… Mr. Trump’s response: “Only Rosie O’Donnell.” Ms. Kelly, to her credit, didn’t let him get away with the joke: “For the record,” she said, “it was well beyond Rosie O’Donnell.” He conceded the point, but said “I think the big problem this country has is being politically correct.” (The crowd cheers!) He doesn’t have time for “total political correctness” and the country doesn’t, either. “What I say is what I say,” he said, and if you don’t like it, “I’m very sorry.”
On Mr. Trump’s assertions that Mexico is sending criminals, rapists, and drug dealers across the border: does he have evidence, as he suggested? First, he said, we wouldn’t even be talking about illegal immigration if it weren’t for him. Somehow, I doubt that; it’s a pretty big issue with Republicans. We need, he said, to build a wall. This led to the first followup of the debate, in which the moderator pressed him on evidence. Mr. Trump said the border patrol told him that’s what’s happening.
He said it’s because “our leaders are stupid.” And the Mexican government is much smarter, and they send the bad ones over, because they don’t want to deal with those people. “Many killings, murders” are being done by illegal immigrants, he said—a questionable assertion at best. That’s what’s happening, though, he said, whether you like it or not.
Obamacare, Mr. Trump said, is a “complete disaster.” But, what about his support fifteen years ago for a single-payer system? He said he was the only one in 2004 who was against the war in Iraq, that he had the “vision” to say that. As for single-payer, it works great in Canada and Scotland, and it could have worked here back when he supported it. Now, though, it wouldn’t; he doesn’t say why not. Now, he wants to allow insurance companies to sell across state lines.
At this point, Rand Paul interrupts to say that Mr. Trump is “on the wrong side of this” for his support for single-payer. Mr. Trump points out that he said he doesn’t support single-payer at the moment. “I don’t think you heard me. You’re having a hard time tonight.”
As the debate went on, Mr. Trump was asked about the four times his companies have declared bankruptcy. Lenders, the moderator said, lost billions. Mr. Trump’s defense was that out of hundreds of deals, four times he used the laws of the country to his advantage, just like many other people have.
The moderator followed up: Trump Entertainment Resorts went bankrupt in 2009, and lenders lost more than a billion dollars. “First of all,” Mr. Trump replied, “these lenders aren’t babies. These are total killers. These are not the nice, sweet little people that you’d think. You’re living in a world of the make-believe.” He said he had the “good sense to leave Atlantic City…before it totally cratered. I’m very proud of it.” Maybe he can do the same for the country?
Another great Trump Moment: He was asked about his donations to Democrats, including Hillary Clinton and Nancy Pelosi, and his statement that he did it to get business favors. What did he get in return for those donations? “I give to many people, I give to everybody. When they call, I give; when I call them, they are there for me. That’s a broken system.” Money in politics? From a Republican? Be still my beating heart! “With Hillary Clinton? I said, ‘Be at my wedding,’ and she came to my wedding. You know why? She had no choice, because I gave.”
You were pro-choice, Megyn Kelly pointed out. You supported an assault weapons ban. When did you actually become a Republican? “I’ve evolved on many issues over the years. I’m pro-life now.”
In the end, Donald Trump was the big story of the debate, but maybe not in a good way. He spoke far more than any other candidate, but I don’t think he did himself any favors. Fox’s focus group hated him, and many of them came in as fans. Saying he won’t rule out running as an independent hurt him a lot.
Every time the other candidates were prompted to attack Mr. Trump, they passed. I think every one of them saw that The Donald was his own worst enemy tonight and didn’t want to get their hands dirty.
§
Then there were the other candidates. And that’s how it was: the others just got through the thing.
Chris Christie: “We have a lot of work to do in NJ, but I’m darn proud of how we brought our state back.” Blah, blah. He didn’t get anywhere. He tried to play the 9/11 card so much I thought he was channeling Rudy Giuliani.
Mike Huckabee and Ted Cruz kept their heads down but did well. I think, bizarrely, you’ll see a bump for Mr. Huckabee after this debate. Ted Cruz is probably happy right now, as well. Huckabee: “The purpose of the military is to kill people and break things.” Okay.
Marco Rubio failed to distinguish himself. At one point, though, Megyn Kelly asked him how he justifies his support for a rape/incest exception for abortion. His response was that he has never advocated that.
What Ms. Kelly seemed to be talking about was that Sen. Rubio introduced a bill that included that exception. In fairness to the senator, that doesn’t necessarily mean he supports that exception—just that he was trying to get something passed.
Scott Walker has an abortion problem. His exchange with Ms. Kelly about letting a woman die rather than aborting a problematic pregnancy is going to be played as a sound bite for quite some time.
On the Iran deal, Gov. Walker remembers “tying a yellow ribbon around a tree in front of my house.” Because, you know, things haven’t changed since 1979.
Rand Paul was Rand Paul. Playing the attack dog to try to make his points. “Get a warrant!” he shouted at Chris Christie on the NSA surveillance question. But he seemed bitter and angry. His best line: “I don’t want my marriage or my guns registered in Washington.” He also said he would cut foreign aid because we can’t afford it—but foreign aid is less than 1% of the federal budget.
Ben Carson is in over his head.
Jeb Bush still sounds like Yet Another Bush.
§
The clown car has a long road ahead of it, folks. Buckle up.
Rick Santorum

Definition of “santorum”
frothy mixture of lube and fecal matter
Look it up.
Via God Emperor Lionel Lauer
Originally shared by Frederick Wright
Pretty much what all the GOP candidates were doing, to themselves, each other, and their gawping, credulous, mouth-breathing supporters.
America, Y U still use voting machines?
Originally shared by ****
Finally, voter fraud. But this won’t be fixable by voter ID.
Don’t get in the way unless you want 400 volts up your arse.

Originally shared by Gideon Rosenblatt
Tesla’s robotic snake automatically moves out from the wall and plugs itself in to charge your car. Musk told us this was coming late last year. Here it is.
#tesla #robotics
It was a monstrous crime.
Originally shared by Mark Peterson
Constructing new views with a neural network trained on real world photographs.
DeepStereo: Learning to Predict New Views from the World’s Imagery
Deep networks have recently enjoyed enormous success when applied to recognition and classification problems in computer vision, but their use in graphics problems has been limited. In this work, we present a novel deep architecture that performs new view synthesis directly from pixels, trained from a large number of posed image sets. In contrast to traditional approaches which consist of multiple complex stages of processing, each of which require careful tuning and can fail in unexpected ways, our system is trained end-to-end. The pixels from neighboring views of a scene are presented to the network which then directly produces the pixels of the unseen view. The benefits of our approach include generality (we only require posed image sets and can easily apply our method to different domains), and high quality results on traditionally difficult scenes. We believe this is due to the end-to-end nature of our system which is able to plausibly generate pixels according to color, depth, and texture priors learnt automatically from the training data. To verify our method we show that it can convincingly reproduce known test views from nearby imagery. Additionally we show images rendered from novel viewpoints. To our knowledge, our work is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.