Microsoft-BCS/BCS-IRSG Karen Spärck Jones Award 2020

I am happy to announce that the winner of the 2020 Microsoft-BCS/BCS IRSG Karen Spärck Jones award (to be presented at ECIR 2021 next year) is Dr. Ahmed Hassan Awadallah (Principal Research Manager at Microsoft AI Research in Redmond, WA, USA).

Ahmed has accepted the award. He will give a talk at ECIR 2021 (originally in Lucca, now online only).

I would like to thank the eight independent judges for their valued contributions. 

Cell Differentiation, GEB and High School Biology

James Somers wrote on his blog:

I wish my high school biology teacher had asked the class how an embryo could possibly differentiate — and then paused to let us really think about it. The whole subject is in the answer to that question. A chemical gradient in the embryonic fluid is enough of a signal to slightly alter the gene expression program of some cells, not others; now the embryo knows “up” from “down”; cells at one end begin producing different proteins than cells at the other, and these, in turn, release more refined chemical signals; …; soon, you have brain cells and foot cells. How come we memorized chemical formulas but didn’t talk about that? It was only in college, when I read Douglas Hofstadter’s Gödel, Escher, Bach, that I came to understand cells as recursively self-modifying programs. The language alone was evocative. It suggested that the embryo — DNA making RNA, RNA making protein, protein regulating the transcription of DNA into RNA — was like a small Lisp program, with macros begetting macros begetting macros, the source code containing within it all of the instructions required for life on Earth. Could anything more interesting be imagined?

That’s exactly right, and that’s why I think all school kids should read Gödel, Escher, Bach. I was lucky to buy myself a copy at 16 (I had seen the book mentioned in very different contexts that had not much to do with each other, and that made me curious), and it is fair to say it changed my life.

Some Recommended ICLR 2021 Papers

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Autoregressive Entity Retrieval
Predicting Infectiousness for Proactive Contact Tracing
ICLR 2021 – A Small Selection of Top Papers

These 5 also happen to be among the 15 best-scoring papers out of nearly 3,000 submissions (there are of course many other good papers in that top 15; I have an admitted positive bias towards NLP work, as that is what I work on).

Call for Nominations: The Microsoft BCS/BCS IRSG Karen Spärck Jones Award 2020

Closing date (extended): 18 September 2020 (Anywhere on Earth TZ)

               ~ An award to commemorate Karen Spärck Jones ~

A pioneer of information retrieval, the computer science sub-discipline that also underpins the technology of modern Web search engines, Karen Spärck Jones was the Professor of Computers and Information at the University of Cambridge in England. Her contributions to the fields of Natural Language Processing (NLP) and Information Retrieval (IR), especially with regard to experimentation, have been outstanding, highly influential and lasting, and include the introduction of Inverse Document Frequency for relevance ranking. Her achievements resulted in her receiving a number of prestigious accolades such as the BCS Lovelace medal for her advancement of Information Systems, and the ACM Salton Award for her significant, sustained and continuing contributions to research in information retrieval. Karen was also an outspoken advocate for women in computing.

To learn more about Karen and her work, see:

In order to honour Karen’s achievements, the BCS Information Retrieval Specialist Group (BCS IRSG) in conjunction with the BCS has established an annual award to encourage and promote talented researchers who have endeavoured to advance our understanding of Natural Language Processing or Information Retrieval with significant experimental contributions.

To celebrate the commemorative event, the recipient of the 2020 award will be invited to present a keynote lecture at BCS IRSG’s annual conference — the European Conference on Information Retrieval (ECIR) next year. This forum provides an excellent venue to present and announce the award as the conference attracts many new and young researchers.

Eligibility. Open to all NLP/IR researchers who have no more than 10 years of post-doctoral or equivalent experience at the closing date for nominations (non-research time, e.g. parental leave or career breaks, will be taken into account).

Criterion. To have endeavoured to advance our understanding of NLP and/or IR through experimentation.

Nominations. The following should be provided:
• Name of nominee, position, affiliation, years since completion of the Ph.D.;
• Name of person proposing the nominee, position, and affiliation;
• Short case for the award, not to exceed 2500 words, highlighting the contributions the individual has made;
• List of the individual’s top five publications reflecting the relevant contributions, and role within these; and
• Exactly two supporting letters from people who would like to encourage/support the nomination.

Nominations should be emailed to the panel chair below. The support letters can be emailed separately by the referees. It is possible for individuals to nominate themselves, in which case they should provide three support letters. Please note that we anticipate that people who provide support letters will do so only for a single candidate.

Award Panel. The Award Panel Chair, appointed by the BCS IRSG Committee, will invite panel members from amongst representatives of the BCS main council, the BCS IRSG Committee, sponsoring organisation(s), as well as at least two experts appointed by the BCS IRSG committee.

Prize. The recipient of the award will receive a certificate, a trophy, and a cash prize of £1,000, plus expenses for the awardee to travel to ECIR.

Timeline for the 2020 Award to be presented at ECIR 2021:
• 18 September 2020 — closing date for nominations (update: has been extended by 2 days);
• 25 September 2020 — deadline for support letters (update: has been extended by 2 days);
• 16 December 2020 — notification of the prize recipient;
• 28 March-1 April 2021 — recipient presents keynote at ECIR 2021 in Lucca, Italy.

The Karen Spärck Jones Award is sponsored by Microsoft Research Cambridge.

Award Chair: Jochen L. Leidner, Refinitiv Labs and University of Sheffield.

For a list of previous recipients of the award, see:

Ubuntu 20.04 Security Vulnerability

Ubuntu Linux 20.04 LTS has made login passwords displayable with a button in the way WiFi passwords usually are. While this may have some utility, it presupposes the cleartext form is stored somewhere, which could be a vulnerability or at the very least could be said to increase the attack surface of the system. I think for WiFi, the cost/risk benefit is okay; for user and root passwords, however, I think the risk by far outweighs the benefits.

Introduction to Financial Markets – Reading Guide

Now and again, people ask me where to start if they would like to acquire knowledge about financial markets. So I have put together a little initial reading list.


Larry Harris (2012), Trading and Exchanges: Microstructure for Practitioners

Investment & Advisory

Glen Arnold (2014), FT Guide to Banking

Frank J. Fabozzi and Harry M. Markowitz (2011), The Theory and Practice of Investment Management: Asset Allocation, Valuation, Portfolio Construction, and Strategies

Giuliano Iannotta (2014), Investment Banking: A Guide to Underwriting and Advisory Services


Sébastien Billot (2020), Financial Crime Compliance: Identify and Mitigate Financial Crime Risks

Wealth Management

Charlotte B. Beyer (2014), Wealth Management Unwrapped

CoViD-19: Some Surprises

In this post, I’d like to point out a few observations that have surprised me during the current pandemic.

Country behaviors. A pandemic may require responses that are more authoritarian than a society’s normal operations, and this in itself is a controversial topic. But if we accept it for the moment, then what could be observed is that two processes were at play in parallel: the official authorities at the top (e.g. the European Union, the British government, the US federal government) made announcements, but it was lower-level authorities that were actually responsible for much of the day-to-day rules, and the inconsistent messaging kept confusing people. Furthermore, while there were a few acts of charity (like Romanian medical staff flying to Italy to help, or German hospitals taking in patients from France and Italy), overall people were quite country-focused. At the same time, each country’s population (and media) was keenly looking at others’ performance as a way to “benchmark” (for lack of a better term) its own government’s performance. This has become possible due to the Internet as a global communication enabler. Unlike a war, a pandemic attacks all of humanity in a globally connected world, so one would have hoped countries would work together to speed up the extinction of the disease.

Organizational behavior. Many companies finally switched to online work. This should have happened 15-20 years ago, but better late than never. Keeping up the habit of business flights to visit colleagues in the very same company who just happen to be located on another continent has always struck me as the biggest waste of money, while at the same time creating huge environmental damage. It is refreshing how unproblematic this shift was, how quickly everything could be implemented (given that there was zero preparation), and how effectively things have been running, at least in businesses that are suitable for online work. The losers were schools and government administrations: nations talking about “one laptop per child” in developing countries were often unable to organize their own pupils. In London, the first architecture office with 50 staff has reportedly canceled its office lease, not because of financial struggles from the pandemic but in response to the insight that an office is no longer needed (given the cost of London-based office space, that’s no surprise). I would not be surprised if in the future more companies were “mostly virtual”, with occasional meetings in physical spaces rented on demand by the hour or day to stay connected on a personal level. Companies will soon turn their attention towards recovery and leave the pandemic memory behind. But there have been 60 pandemics from 2000-2020, so one would expect some kind of institutional learning to happen in advanced organizations (CMM level 5?).

People’s behavior. People’s personal beliefs and their degrees of adherence to official guidance (or mandatory rules) are interesting to observe. Generally, as is perhaps expected, earthlings are cognitively ill-equipped to deal with abstract concepts and tiny viruses invisible to the eye. So what happened is that people started to take the pandemic more and more seriously as soon as someone in their personal environment was affected, but no sooner. Actual behavior often differed from projected behavior, as evidenced by various senior scientists, advisors, or ministers who were caught and reported in the media to be in violation of rules they themselves promoted. Different ethical value systems also shone through, e.g. whether trading lives against business losses was seriously being considered.

Scientists’ responses. Scientists disagree with each other, and that’s fine – at least when they are among themselves. What is not fine is to present only one view when communicating with external (non-scientist) audiences, as this creates a misconception of consensus. On the side of public health policy, I am stunned that no-one has forcefully argued for more alignment and standardization in the counting of the infected and the dead across countries. If enough information is collected for each case, governments could easily tally up counts in more than one way, which renders invalid the argument that a particular standard would not meet a country’s internal requirements or not appropriately address its needs. Even more stunning is that no strong voices have been speaking out in favor of recurring, national/regional random-sample CoViD-19 testing with the aim of getting an unbiased view of the pandemic’s spread. Instead, debates were fought over data sets known to be heavily biased, and attacked as invalid, without any attempt to implement proposals to fix this.

The source of the pandemic itself. The SARS-CoV-2 virus and the pandemic it caused (CoViD-19) are remarkable in that the virus is not very deadly, at least in relative terms when compared e.g. to Ebola, yet it caused havoc at unanticipated scale. It turns out that one of the “success” factors of the little (30 kB of information) coronavirus is exactly that it does not kill people quickly, but lets them pass on the disease to many other individuals before symptoms get very strong.

SARS-CoV-2: A Crude Back-of-the-Envelope Estimation of Deaths

Disclaimer: I am not a medic, and not a pandemic modeling researcher. But I am a computer science researcher who has built models of various kinds since the 1990s, many published and sold, and I do have a background as a former Red Cross paramedic (yes, I know how to convert a hospital in case of an Ebola outbreak and such, and I have intubated/resuscitated folks).

This post is a response to various other models that I’ve seen and found too complicated. A complex model built while we still know very little does not inspire credibility in me.

Here is a very crude (back-of-the-envelope) calculation of the overall estimated deaths per country for two countries that I know a bit better and have been following online and offline since December.

The numbers cover the full Corona pandemic period (not just up to a certain date). The forecast:

  • United Kingdom: between 66,500 and 798,000 deaths
  • Germany: between 8,000 and 96,000 deaths

This “model” is based on the following assumptions:

  • We don’t really know a lot, so we need wide confidence margins. Don’t believe anyone who gives you one number.
  • Because of our lack of knowledge about the disease, as tempting as it may be to run a simulation, I don’t feel comfortable with that approach, as it suggests “more science” than we have.
  • The % of population eventually infected is: 10%-60% (taken from expert statements)
  • The % of exitus letalis outcomes (% of infected eventually dying from or in connection with SARS-CoV-2) is: 10%-20% (my own observation from JHU data: 9%-22%, rounded to 10% best case and 20% worst case; thankfully, at the time of writing we’re down from 22% to 17%)
  • Country populations:
    Germany: 80 million
    UK: 66.5 million
  • Response effectiveness order of magnitude (OoM): Germany: 10E-2; UK: 10E-1 – the multiplier relative to a “do nothing” approach (which would be treated as 10E-0), based on my observations.
  • Note there are absolutely no assumptions made about the actual duration by design – the above is a pure “part of the pie” computation.
  • Existing knowledge (model should be consistent with these):
    UK: at least 30k dead as of May 8
    Germany: at least 8k dead as of May 8
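Since the model is just a multiplication of ranges, it fits in a few lines of code. Here is a minimal sketch in Rust (function and constant names are my own; the ranges are exactly the assumptions listed above, not established epidemiology):

```rust
// "Part of the pie" computation: population × infected share × fatality
// share × response-effectiveness multiplier.
const INFECTED_RANGE: (f64, f64) = (0.10, 0.60); // share of population eventually infected
const FATALITY_RANGE: (f64, f64) = (0.10, 0.20); // share of infected who die

/// Returns the (best case, worst case) estimated deaths for a country.
fn death_bounds(population: f64, response_multiplier: f64) -> (f64, f64) {
    (
        population * INFECTED_RANGE.0 * FATALITY_RANGE.0 * response_multiplier,
        population * INFECTED_RANGE.1 * FATALITY_RANGE.1 * response_multiplier,
    )
}
```

Calling death_bounds(66_500_000.0, 1e-1) reproduces the UK interval (66,500 to 798,000), and death_bounds(80_000_000.0, 1e-2) the German one (8,000 to 96,000).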

I will compare these numbers against body counts on 2021-05-05. If the model is good, the total numbers of deaths (hospital and otherwise) for the two countries will lie in the two interval brackets provided.

Potential future work includes:

  • apply to other countries;
  • refine the “response effectiveness multiplier” based on a set of critical policy elements being present or not in a country;
  • provide (separate) forecasts for the duration of the pandemic and the financial impact.

Looking into Rust

Rust is a programming language that was started around 2006 by a Mozilla employee as a personal project; its inventor managed to convince Mozilla to make it an official project, and in recent years Rust has consistently ranked top as the language most liked by developers. It competes with Go in their joint attempt to dethrone C/C++ as the standard language for highly performant systems programming.

The reason I got interested in Rust is that it uses strong static types and type inference. Its notation inherits some elements from the functional language ML, which is close to the mathematical notation for functions, and that in turn makes the code easy to read, e.g.

fn calculate_length(s : String) -> (String, u64) {
    // return a tuple of the string and its length as an unsigned 64-bit integer
    let len = s.len() as u64;
    (s, len)
}

Ownership and Explicit Ownership Transfer

Unlike Java, Python, LISP or Go, Rust doesn’t use garbage collection. Unlike C, it also does not use explicit malloc()/free() calls, which have been difficult for developers to keep track of and a source of bugs, crashes and security vulnerabilities. So how does Rust do it?

Basically, a (non-atomic) object that leaves its scope (function, block) gets released, unless an explicit ownership transfer is demanded. References can refer to an object without taking ownership of it, as can slices, which are contiguous ranges of container elements:

let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
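To make the move semantics concrete, here is a small toy example of my own (not from the manual), showing release-on-scope-exit and explicit ownership transfer through a function:

```rust
fn takes_ownership(s: String) -> usize {
    s.len() // s is dropped (its memory freed) when this scope ends
}

fn takes_and_gives_back(s: String) -> String {
    s // returning the value moves ownership back to the caller
}

fn demo() {
    let s1 = String::from("hello");
    let s2 = s1; // ownership moves from s1 to s2
    // println!("{}", s1); // would not compile: value used after move
    let n = takes_ownership(s2.clone()); // clone here so s2 stays usable below
    let s3 = takes_and_gives_back(s2); // s2 moved in, ownership returned as s3
    println!("{} ({} bytes)", s3, n);
}
```

Uncommenting the println on s1 turns the move into a compile-time error, which is exactly the point: the compiler, not the runtime, catches the use-after-free.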

For more detail, consult:

Rust’s compiler can also figure out at compile time when there is a chance of a dangling reference. In the words of the language manual:

“The Rust language gives you control over your memory usage in the same way as other systems programming languages, but having the owner of data automatically clean up that data when the owner goes out of scope means you don’t have to write and debug extra code to get this control.”

First experiments with Rust

Downloading and trying the rustc compiler via the cargo build system turned out to be easy. Libraries specified as dependencies (“crates”) automatically get pulled from the Rust repository, a far cry from the effort it takes to install/build basic C++ libraries that are not header-only. The Rust compiler’s error messages are readable, they localize errors well (admittedly, it is not hard to do better than GNU g++ on that front), and the use of colour coding distinguishes source code fragments from the error messages proper in human-friendly ways.

The Crux

The litmus tests for a new programming language are stability, community and libraries. Without a stable syntax, serious developers quickly shy away from investing their time, and making a production bet seems too risky. Without a thriving community around a language, the continuity of tool development, library development and general problem solving is in jeopardy (you want to be coding in something for which you can find the solution to your problem on StackExchange, really). And without available libraries that provide GUI frameworks, logging tools, regex engines, database abstraction layers, CSV readers, visualization toolkits and other daily needs (some general, some depending on your area), your productivity will be reduced by the distraction of needing “just one day more, I need to quickly implement a hashtable library”.

I may return with a report of my Rust story after gaining a bit more experience, and after finishing reading the manual.

Report from ECIR 2020 (online)

This post summarizes some impressions from my virtual attendance of ECIR 2020, which was convened as an online-only conference rather than in Lisbon due to the 2019/2020 Coronavirus pandemic.


I set up three machines with monitors side by side and one laptop with a full-screen Sublime window open to take notes. Also important are a bottle of water and some chocolate within easy reach, as a four-day sitting marathon of binge technical talk viewing can be challenging – not something I had done before. Zoom lets you sign in multiple times to different rooms or even the same room, and I also had windows open with Slack (chat) as well as the conference timetable. [I actually discovered some of this set-up only during Wednesday/day 2, as I was still an online conferencing newbie on Tuesday, busy with running our tutorial.]
Interestingly, I ended up not using the proceedings much, but if you are interested, they are free online for one year (e.g. Proc. ECIR Vol. 1: LNCS 12035).


The conference has been growing for a while and is still growing. Topic areas include deep learning, entities in search, evaluation, recommendation, information extraction, retrieval, multimedia, queries, question answering, bias, reproducibility, and multilinguality. Of all these areas, deep learning constituted the largest (perceived) body of contributions, and I would say the work on replication was the most unique and among the most exciting; IR research is based not on measuring laws of nature but on assessing methods embodied as software artifacts, which incorporate a plenitude of important decisions; therefore, reproducing and replicating past work is even more important than in the natural sciences.
Nine papers were selected for IR Journal publication (Springer).

Three tutorials were offered, “Principle-to-Program: Neural Methods for Similar Question Retrieval in Online Communities”, “Text Meets Space: Geographic Content Extraction, Resolution and Information Retrieval” and “The Role of Entity Repositories in Information Retrieval”.

The four workshops were the “International Workshop on Algorithmic Bias in Search and Recommendation (Bias 2020)”, the “Bibliometric-Enhanced Information Retrieval [10th Anniversary Workshop Edition]”, the “3rd International Workshop on Narrative Extraction from Texts: Text2Story”, and the “First International Workshop on Semantic Indexing and Information Retrieval for Health from Heterogeneous Content Types and Languages (SIIRH)”.

According to the organizers, “457 submissions were fielded across all tracks from 57 different countries”. Accepted and presented were 131 papers (55+46+10+8+12): 55 long papers (26% acceptance rate), 46 short papers (28%), 10 demonstration papers (30%), 8 reproducibility papers (38%), and 12 invited CLEF papers.


My collaborators Ross Purves (Zurich), Katie McDonough (Turing Institute), Bruno Martins (Lisbon) and I ran a tutorial entitled Text Meets Space, which was a four-hour event stretched out over a full day to make space for breaks as well as the opening keynotes. Tutorials at ECIR 2020 were not recorded, unlike the main conference, and I did not mind in the least, as it was the first time for us running it (with the benefit of hindsight, things worked very well).

Check out our slides and materials on GitHub

In the keynote slot, I attended Vanessa Murdock’s talk on doing science in an industry setting, which was very insightful as she has seen different environments at Yahoo, Microsoft and now Amazon. Being the co-organizer of one event meant that I could not sneak out and also attend some of the great events in parallel, like the workshops or the other two tutorials. The parallel bibliometric IR workshop had its 10th anniversary and featured an interesting line-up; sadly, I later learned they got “Zoom-bombed” (through a combination of human error and poor security design on Zoom’s part: it puts passwords in URLs!) and experienced e-harassment and vitriol from hooligans defacing their screen. As a lesson for organizers: lock your Zoom rooms, don’t share links that include passwords, or avoid Zoom altogether.


Udo Kruschwitz (virtually) handed the Microsoft-BCS Karen Spärck Jones Award 2019 to Chirag Shah from the University of Washington, who gave the award lecture “Task-Based Intelligent Retrieval and Recommendation”. Chirag has tirelessly taken the position in his research that any search activity ought to be seen in the context of the specific task it is part of. Chirag’s talk also pointed out the importance of making people aware of what they do not know that they do not know (he coined the term “Information Fostering” for system behavior that exploits user knowledge to improve that kind of awareness). Microsoft have kindly extended the award budget, and Udo and the rest of the BCS-IRSG committee have handed over the honour of chairing the award committee to me, so if you have a rising star researcher in your lab who is within 10 years of their Ph.D., consider nominating them!

Fard, Thonet and Gaussier introduced a minimally-supervised deep learning based method for clustering where cluster formation is informed by seeds. The method is applied to 5 data-sets, including Reuters-21578 (there was one other paper that used Reuters data, RCV1/2, also by Grenoble researchers, Doinychko and Amini). Meng et al.’s ReadNet is a neural model for readability scoring (featuring a nice synopsis of past work on p.37). Rebuffel et al., an interdisciplinary team of Sorbonne, BNP Paribas and Criteo researchers, presented a transformer-based model for NLG from structured data, developed as part of the H2020 AI4EU project. Successful academic-industry collaboration was a recurring topic this year, with several KTP projects, EU projects and commercially-funded collaborations presenting their successful output. Kato et al. is a very relevant paper for the finance sector, which is interested in company named entities. It proposes models to score entities by various criteria, e.g. a country may be assessed by its crime rate, inter alia (such entity ranking has been part of INEX and TREC). Gerritse, Hasibi and de Vries’ paper on entity ranking can likewise be applied to companies. Giannopoulos, Kaffes and Kostoulas presented “Learning Advanced Similarities and Training Features for Toponym Interlinking”: given a pair t1, t2 of place names, do they correspond to the same real-world spatial entity? This is of course location named entity disambiguation without full resolution, and as such an alternative may be to just resolve toponyms and then check for equality of the resulting spatial footprints. Their approach is to define a “meta-similarity” (“LGM-Sim”). Saier and Farber model citation contexts in order to improve recommendations for scientific papers by including claim evidence. Sikka et al. presented a predictive model to estimate a piece of code’s asymptotic complexity, of course an impossible task in general (still, F1=.65). Brochier et al. (U Lyon)’s “Inductive Document Network Embedding with Topic-Word Attention” introduces “Topic-Word Attention” (TWA), a concept for the interplay between word and topic representations, a bit of progress on the topic model front (see example output p.338). Lu, Du and Nie’s VGCN-BERT extends classic BERT with graph convolutional networks for better text classification (p.405 ff.) by incorporating global and local information about the vocabulary. Camara and Hauff is an important paper, as it shows how and how much BERT can contribute to core retrieval, and how to analyze this using the framework proposed by Rennings et al. last year (Fang’s “retrieval heuristics” + “diagnostic datasets”).

I liked the ECIR programme for its variety, which ranged from Arabic applications to medical retrieval, from disaster tweets to style transfer in NMT; and the submissions from 57 countries, including several acceptances with authors from the US, China, India and South Korea, elevate ECIR’s standing as a not-just-European conference.


Jimmy Lin’s reproducibility talk tried to get retrieval system setups to work again after several years, and that was a great example of technical debt and software ‘rot’ (‘evolution of the system environment’, as some might call it). He pointed out that from Robertson et al. (1994)’s BM25 model to Lucene’s adoption of the same (2015) it took the community 21 years – TWO DECADES! – and he raised the very valid question of how we may be able to expedite tech transfer between R&D labs and open source packages that were not prima facie spun out from the former.


Unni Krishnan presented a collaboration between Microsoft and academic researchers on creating plausible but synthetic query logs in order to enable data sharing for open research and development on query auto-completion (QAC). To apply query auto-complete algorithms, and therefore to do research in this space, what is needed is a set of partial query strings that represent states in a user’s query entry process, e.g.

kung f
kung fu
kung fu panda

Notably, the paper introduces the notion of a surrogate log based on abstract(ed) QAC logs <4, 2, 9> (a tuple comprising the lengths of the original query’s words): matching these signatures leads to finding corresponding target signatures in a synthetic target corpus for a seed signature computed from the unsharable source log. The proposed method covers sampling queries from seeds to accommodate empirically found power-law distributions, a language model to find good replacement and substitution words that are used to emulate user typos, and a comparison strategy. The authors demonstrated the efficacy of their proposal on 3 data-sets: Wiki-Clickstream and two 2018 logs from the Microsoft Bing Web search engine. The evaluation shows that the power-law property of the natural logs is retained, and that Heaps’ law, N-gram frequency and empirical entropy are consistent with the non-synthetic logs (another paper, Jaiswal, Liu and Frommholz, also tackles auto-complete, but for image queries). Krishnan et al.’s paper won the Signal Industry Award because it enables research that would otherwise not happen or be hidden from the open scientific process. Query auto-complete is a practical problem of many information access systems, and making plausible, even synthetic, data-sets available that can be shared builds a bridge between companies and academic researchers to evaluate their methods on a common reference, whilst protecting companies that would like to share log data from privacy issues, as they also have to protect their users. It is interesting that this work came out of SocialNUI, the Microsoft Research Centre for Social Natural User Interfaces, a partnership between Microsoft Research, Microsoft Australia, the University of Melbourne and the Victorian State Government between 2014-2018, hosted by the Interaction Design Lab group in the School of Computing and Information Systems at the University of Melbourne, which makes it a success story in academic-industry collaboration.
[Disclosure: I was on the ECIR 2020 Signal Industry Award selection committee.]
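The signature idea can be illustrated with a toy sketch (my own illustration of the concept only; the paper’s actual log format and matching procedure are more involved):

```rust
/// Reduce a query to its length signature, e.g. "kung fu panda" -> [4, 2, 5].
fn signature(query: &str) -> Vec<usize> {
    query.split_whitespace().map(|w| w.len()).collect()
}

/// A query from a shareable corpus can stand in for an unsharable seed
/// query when their length signatures match.
fn signatures_match(seed: &str, candidate: &str) -> bool {
    signature(seed) == signature(candidate)
}
```

For example, "kung fu panda" has signature [4, 2, 5], and so does "best of seven", so the latter could replace the former in a surrogate log while preserving the length statistics.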

Zhong et al. presented work on mathematical formula search done by the CiteSeerX team (there were also papers on finding tables and chemical compounds). Uprety et al. model users’ decision making, in particular the uncertainty pertaining to it, using the mathematical formalism of quantum physics. Witschel, Riesen and Grether present a question answering system over knowledge graphs: questions are translated into Cypher, an open-source graph query language (p.789).

Wang et al. is a great paper on searching news archives, in particular how to answer event questions for which the temporal dimension must be modelled to do well.

The reproducibility track was dominated by Jimmy Lin’s group, which presented several papers there. My favorite, Lin and Zhang, re-ran old experiments using popular IR engines, and found out that due to technical debt of the platforms, often systems could no longer be executed after a few years; interestingly, this didn’t apply to the Terrier platform, as it is written in Java, whose virtual machine insulates the software from the (changing) external environment. (The reproducibility track at ECIR is always worth stopping by.)

Ghanem et al. presented a method to detect irony in several languages with F1 between 74% and 80%. Hashemi et al. presented ANTIQUE, an open-domain question answering test collection (|Q|=2,626; |Rel|=34k). They also presented some baseline benchmark results for open-domain QA models (vol. II, p.171), e.g. BERT P@1=70.92. Ishigaki et al. presented a new neural abstractive summarization model, which is query-informed (vol. II, p.210). Machida et al., another paper on summarization, is extractive and uses question-answer pairs (vol. II, p.291).

Once in a while you find a really clever idea explored in a paper, as Papadakos and Kalipolitis did, in my opinion, with their study of antonyms in Web queries (vol. II, p.356): using query pairs containing antonyms, such as “capitalism and war” versus “capitalism and peace”, they explore how bringing in the “opposite” query can help systems give users a better idea of the conceptual result space.
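To make the idea concrete, here is a minimal sketch (my own, not the paper's method) that derives the “opposite” query by substituting terms from a tiny hand-made antonym lexicon:

```python
# Tiny hand-made antonym lexicon; a real system would draw on a
# resource such as WordNet and handle multi-word terms and context.
ANTONYMS = {"war": "peace", "peace": "war", "rich": "poor", "poor": "rich"}

def opposite_query(query: str) -> str:
    """Swap each term that has a known antonym for its opposite."""
    return " ".join(ANTONYMS.get(term, term) for term in query.split())

print(opposite_query("capitalism and war"))  # -> capitalism and peace
```

Both the original and the derived query could then be issued, and their result lists contrasted to expose the conceptual spread of the topic.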

Sanchez et al. is an interesting application, namely keeping tabs on the evolution of legislation. The team of authors from Signal AI and UCL use a combination of learning to rank and BERT (vol. II, p.372). Researchers at Bloomberg London presented “Identifying Notable News Stories” (Saravanou, Stefanoni and Meij), a paper in which new stories are compared to past (known) notable events. Devezas and Nunes gave an online demo of Army ANT, a piece of software for conveniently conducting IR experiments that is essentially a Python wrapper around collections, indices and retrieval evaluations. Froebe et al. is a demo of a search engine for German police reports: a news story about a crime (its URL) is the query, and the top-k police reports are retrieved to support the veracity of the story. (I also learned from the paper that in 2016 there was a hostage situation in a cinema I used to go to, which had eluded me completely due to my expatriate existence.) Martins and Mourão's system Revisionista.PT tracks post-publication edits to 140k articles from 12 Portuguese news outlets.

At the very interesting Industry Day, Pedro Nogueira of Farfetch gave a talk on faceted search in the fashion industry (I should say Levi Strauss was a sponsor of ECIR 2020, so fashion definitely met retrieval this year). Farfetch provides a global search over high-end fashion, giving people elsewhere access to luxury goods: 4,500+ staff, an average spend per order of $608 (!). It is a high-end, high-margin niche market play where trends change fast, but they seem to be doing rather well.

Ashlee Edwards, a scientist at Netflix, presented some of her research and
product innovation work in the areas of consumer insight, design and product
management. Her background includes studies on stress caused by software and
on subtitle usability for blind viewers of online movies.

Overall, I found attending the conference very useful, but not quite fun. Four days
of technical material are challenging to endure in a strange place with jet-lag,
but sitting through them alone at home in your room frankly makes one's back hurt,
and it is even lonelier than working from home, where at least you interact
more, if virtually. I had been looking forward to this event as much as to Lisbon
itself, and no webinar can make up for consuming pastéis de nata in the
conference break and meeting your Portuguese friends again over a dinner with
Fado in a down-to-earth restaurant with Lisbon's signature blue-and-white tiled
walls. Normally, one returns home energized and with a set of photos, but a
virtual event does not give you much back emotionally: for example, it is hard,
or close to impossible, to meet new people at a virtual conference (who would
just double-click on someone's name on Slack to talk to a complete stranger?).
So I feel for all those students for whom this may have been their first
conference: they did not get the rite of passage that comes with attending your
first one and giving that first ever talk. I liked Tuesday and Friday best, because
the interactive tutorial presentations were engaging and the industry day was
lighter and therefore perhaps easier to sit through remotely.
For completeness' sake, I should also point out that online-only
conferences of course have advantages: they may be easier to attend for
students struggling with visa issues or small travel budgets, especially if
they are based far from Europe. The environment is burdened with fewer
flights, and people who were not planning to attend the physical conference
may join presentations (possible because the organizers opened up the event
to the world), which increases the exposure of the published work.
The ECIR organizers deserve a big thanks for rescuing our ECIR 2020
conference by turning it into an online event at short notice. Organizing any
event is hard enough, but they did it twice: first the physical conference,
then its online incarnation (not to mention handing down 1,600+ pages of
proceedings to us!).