Cell Differentiation, GEB and High School Biology

James Somers wrote on his blog:

I wish my high school biology teacher had asked the class how an embryo could possibly differentiate — and then paused to let us really think about it. The whole subject is in the answer to that question. A chemical gradient in the embryonic fluid is enough of a signal to slightly alter the gene expression program of some cells, not others; now the embryo knows “up” from “down”; cells at one end begin producing different proteins than cells at the other, and these, in turn, release more refined chemical signals; …; soon, you have brain cells and foot cells. How come we memorized chemical formulas but didn’t talk about that? It was only in college, when I read Douglas Hofstadter’s Godel, Escher, Bach, that I came to understand cells as recursively self-modifying programs. The language alone was evocative. It suggested that the embryo — DNA making RNA, RNA making protein, protein regulating the transcription of DNA into RNA — was like a small Lisp program, with macros begetting macros begetting macros, the source code containing within it all of the instructions required for life on Earth. Could anything more interesting be imagined?

That’s exactly right, and that’s why I think all school kids should read Gödel, Escher, Bach. I was lucky to buy myself a copy at 16 (I had seen the book mentioned in very different contexts that had not much to do with each other, and that made me curious), and it is fair to say it changed my life.

Some Recommended ICLR 2021 Papers

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Autoregressive Entity Retrieval
Predicting Infectiousness for Proactive Contact Tracing
ICLR 2021 – A Small Selection of Top Papers

These 5 also happen to be among the 15 best-scoring papers out of nearly 3,000 submissions (there are of course many other good papers in the top 15; I admit to a positive bias towards NLP work, as that is what I work on).

Call for Nominations: The Microsoft BCS/BCS IRSG Karen Spärck Jones Award 2020

Closing date (extended): 18 September 2020 (Anywhere on Earth TZ)

               ~ An award to commemorate Karen Spärck Jones ~

A pioneer of information retrieval, the computer science sub-discipline that also underpins the technology of modern Web search engines, Karen Spärck Jones was the Professor of Computers and Information at the University of Cambridge in England. Her contributions to the fields of Natural Language Processing (NLP) and Information Retrieval (IR), especially with regard to experimentation, have been outstanding, highly influential and lasting, and include the introduction of Inverse Document Frequency for relevance ranking. Her achievements resulted in her receiving a number of prestigious accolades such as the BCS Lovelace medal for her advancement of Information Systems, and the ACM Salton Award for her significant, sustained and continuing contributions to research in information retrieval. Karen was also an outspoken advocate for women in computing.

To learn more about Karen and her work, see:

In order to honour Karen’s achievements, the BCS Information Retrieval Specialist Group (BCS IRSG) in conjunction with the BCS has established an annual award to encourage and promote talented researchers who have endeavoured to advance our understanding of Natural Language Processing or Information Retrieval with significant experimental contributions.

To celebrate the commemorative event, the recipient of the 2020 award will be invited to present a keynote lecture at BCS IRSG’s annual conference — the European Conference on Information Retrieval (ECIR) next year. This forum provides an excellent venue to present and announce the award as the conference attracts many new and young researchers.

Eligibility. Open to all NLP/IR researchers who have no more than 10 years of postdoctoral or equivalent experience at the closing date for nominations (non-research times, e.g. parental leave or career breaks, will be taken into account).

Criterion. To have endeavoured to advance our understanding of NLP and/or IR through experimentation.

Nominations. The following should be provided:
• Name of nominee, position, affiliation, years since completion of the Ph.D.;
• Name of person proposing the nominee, position, and affiliation;
• Short case for the award, not to exceed 2500 words, highlighting the contributions the individual has made;
• List of the individual’s top five publications reflecting the relevant contributions, and role within these; and
• Exactly two supporting letters from people who would like to encourage/support the nomination.

Nominations should be emailed to the panel chair below. The support letters can be emailed separately by the referees. It is possible for individuals to nominate themselves, in which case they should provide three support letters. Please note that we anticipate that people who provide support letters will do so only for a single candidate.

Award Panel. The Award Panel Chair, appointed by the BCS IRSG Committee, will invite panel members from amongst representatives of the BCS main council, the BCS IRSG Committee, sponsoring organisation(s), as well as at least two experts appointed by the BCS IRSG committee.

Prize. The recipient of the award will receive a certificate, a trophy, and a cash prize of £1000, plus expenses for the awardee to travel to ECIR.

Timeline for the 2020 Award to be presented at ECIR 2021:
• 18 September 2020 — closing date for nominations (update: has been extended by 2 days);
• 25 September 2020 — deadline for support letters (update: has been extended by 2 days);
• 16 December 2020 — notification of the prize recipient;
• 28 March-1 April 2021 — recipient presents keynote at ECIR 2021 in Lucca, Italy.

The Karen Spärck Jones Award is sponsored by Microsoft Research Cambridge.

Award Chair: Jochen L. Leidner, Refinitiv Labs and University of Sheffield.

For a list of previous recipients of the award, cf. http://irsg.bcs.org/ksjaward.php

Looking into Rust

Rust is a programming language that was started around 2006 by a Mozilla employee as a private project; its inventor managed to convince Mozilla to make it an official project, and in recent years, Rust has consistently ranked top as the language most liked by developers. It competes with Go in their joint attempt to dethrone C/C++ as the standard language for highly performant systems programming.

I got interested in Rust because it uses strong static types and type inference. Its notation inherits some elements from the functional language ML, which is close to the mathematical notation for functions, and that in turn makes the code easy to read, e.g.

fn calculate_length(s: String) -> (String, u64) {
    let length = s.len() as u64; // byte length as an unsigned 64-bit integer value
    (s, length) // return a tuple of the string and its length
}
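To make the point about type inference concrete, here is a minimal sketch (my own illustration, not taken from the manual) that calls a function with the signature above. The function body and the names text and length are assumptions for the sake of the example; note that no type annotations are needed at the call site.

```rust
// Assumed body for the signature shown above: returns the string
// together with its length in bytes.
fn calculate_length(s: String) -> (String, u64) {
    let length = s.len() as u64; // len() counts bytes, not characters
    (s, length)
}

fn main() {
    // No annotations here: the compiler infers `String` and `u64`
    // for `text` and `length` from the function's return type.
    let (text, length) = calculate_length(String::from("hello world"));
    println!("'{}' is {} bytes long", text, length);
}
```

The tuple is destructured directly in the let binding, which is another notation that will look familiar to ML users.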

Ownership and Explicit Ownership Transfer

Unlike Java, Python, LISP or Go, Rust doesn’t use garbage collection. Unlike C, it also does not use explicit malloc() / free() calls, which have been difficult for developers to keep track of and a source of bugs, crashes and security vulnerabilities. So how does Rust do it?

Basically, every (non-primitive) value has exactly one owner, and the value gets released when its owner leaves its scope (function, block), unless ownership has been explicitly transferred elsewhere before that. References can refer to an object without owning it, as can slices, which are references to contiguous ranges of container elements:

let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
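The difference between borrowing and transferring ownership can be sketched as follows. This is my own small example, and the helper functions consume and first_word are hypothetical names made up for illustration.

```rust
// Takes ownership of its argument; the String is released
// when this function returns.
fn consume(s: String) -> usize {
    s.len()
}

// Only borrows its argument; the caller keeps ownership.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

fn main() {
    let s = String::from("hello world");

    let word = first_word(&s); // borrow: `s` stays usable afterwards
    println!("{}", word);

    let t = s; // move: ownership transfers from `s` to `t`
    // println!("{}", s); // would not compile: `s` has been moved

    let n = consume(t); // `t` is moved into `consume` and released there
    println!("{}", n);
}
```

Uncommenting the marked line makes the compiler reject the program, which is how use-after-free mistakes get caught before the code ever runs.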

For more detail, consult:

Rust’s compiler can also figure out at compile time when there is a chance of a dangling reference. In the words of the language manual:

“The Rust language gives you control over your memory usage in the same way as other systems programming languages, but having the owner of data automatically clean up that data when the owner goes out of scope means you don’t have to write and debug extra code to get this control.”
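The automatic clean-up described in the quote can be observed with the standard Drop trait. The following is a minimal sketch under my own assumptions: the Tracked type and the drop_order function are hypothetical, invented only to record the order in which values are released.

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a shared log when it is released.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    // Called automatically when the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
        } // `_inner` leaves its scope here and is released
    } // `_outer` is released here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "outer"]
}
```

No clean-up call appears anywhere in the code; the compiler inserts the release at the closing brace of each scope.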

First experiments with Rust

Downloading the rustc compiler and trying it out via the cargo build tool turned out to be easy. Libraries specified as dependencies (“crates”) automatically get pulled from the Rust package registry (crates.io), a far cry from the effort it takes to install/build basic C++ libraries that are not header-only. The Rust compiler’s error messages are readable, they localize errors well (not hard to do better than GNU g++ on that front) and the use of colour coding distinguishes source code fragments from the error messages proper in human-friendly ways.

The Crux

The litmus tests for a new programming language are stability, community and libraries. Without a stable syntax, serious developers quickly shy away from investing their time, and making a production bet seems too risky. Without a thriving community around a language, the continuity of tool development, library development and general problem solving is in jeopardy (you want to be coding in something for which you can find the solution to your problem on StackExchange, really). And without available libraries that provide GUI frameworks, logging tools, regex engines, database abstraction layers, CSV readers, visualization toolkits and other daily needs (some general, some depending on your area), your productivity will suffer from distractions like “just one day more, I need to quickly implement a hashtable library”.

I may return with a report of my Rust story after gaining a bit more experience, and after finishing reading the manual.

Learning to Rank Tutorial and QuickRank

Franco Maria Nardini, one of the SIGIR’19 Learning to Rank tutorial designers and presenters, kindly shared these links in subsequent communications, which are going to be helpful to anyone interested in learning to rank.

I can highly recommend these, especially the tutorial materials (4) and QuickRank (2), which is implemented in ISO C++, compiles beautifully with cmake, and is a pristine example of a well-engineered and easy-to-use software tool.

1) http://rankeval.isti.cnr.it/
2) http://quickrank.isti.cnr.it/
3) http://learningtorank.isti.cnr.it/
4) http://ltr-tutorial-sigir19.isti.cnr.it/