CSAIL Work Products

Belief Programming Implementation

2023-11-27T17:01:52Z

Belief Programming Implementation Atkinson, Eric

Multi-modal and Inertial sensor Solutions for Navigation-type Factor Graphs

2022-09-02T03:27:00Z

Multi-modal and Inertial sensor Solutions for Navigation-type Factor Graphs Fourie, Dehann This thesis presents a sum-product inference algorithm for platform navigation called Multi-modal iSAM (incremental smoothing and mapping). Common Gaussian only likelihoods are restrictive and require a complex front-end processes to deal with non-Gaussian measurements. Instead, our approach allows the front-end to defer ambiguities with non-Gaussian measurement models. We retain the acyclic Bayes tree (and incremental update strategy) from the predecessor iSAM2 max-product algorithm [Kaess et al., IJRR 2012]. The approach propagates continuous beliefs on the Bayes (Junction) tree, which is an efficient symbolic refactorization of the nonparametric factor graph, and asymptotically approximates the underlying Chapman-Kolmogorov equations. Our method tracks dominant modes in the marginal posteriors of all variables with minimal approximation error, while suppressing almost all low likelihood modes (in a non-permanent manner). Keeping with existing inertial navigation, we present a novel, continuous-time, retroactively calibrating inertial odometry residual function, using preintegration to seamlessly incorporate pure inertial sensor measurements into a factor graph. We centralize around a factor graph (with starved graph databases) to separate elements of the navigation into an ecosystem of processes. Practical examples are included, such as how to infer multi-modal marginal posterior belief estimates for ambiguous loop closures; raw beam-formed acoustic measurements; or conventional parametric likelihoods, and others.

Data and Code for "A New Approach to Animacy Detection"

2019-04-08T07:13:17Z

Data and Code for "A New Approach to Animacy Detection" Labiba, Jahan,; Geeticka, Chauhan,; A., Finlayson, Mark This archive contains the code and data for the workshop article "A New Approach to Animacy Detection," published in 2018 in the the 27th International Conference on Computational Linguistics (COLING 2018), in Santa Fe, NM. The root of the archive contains a readme file which explains the archive contents. Furthermore, the archive can be imported directly into the Eclipse IDE as a project encapsulating the executable code and data required to reproduce the results of the paper; the code compiles with Java 1.8. The archive also contains a copy of the near-final version of the paper for reference.

Data and Code for "Automatic Identification of Narrative Diegesis and Point of View"

2019-04-06T05:46:22Z

Data and Code for "Automatic Identification of Narrative Diegesis and Point of View" Eisenberg, Joshua D.; Finlayson, Mark A. This archive contains the code and data for the workshop article "Automatic Identification of Narrative Diegesis and Point of View," published in 2016 in the 2nd Workshop for Computing News Storylines (CNewsStory 2016), co-located with EMNLP 2016 in Austin, TX. The root of the archive contains a README file which explains the archive contents. Furthermore, the archive can be imported directly into the Eclipse IDE as a project encapsulating the executable code required to reproduce the results of the paper; the code compiles with Java 1.8. The archive also contains a copy of the final version of the paper for reference.

Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"

2019-04-08T07:36:30Z

Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory" Finlayson, Mark Alan This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the Journal of Digital Scholarship in the Humanities (DSH), ca. 2016.The archive contains several different types of files. First, it contains the annotation guides that were used to train the annotators. The guides are numbered to match the team numbers in Table 6. Included here are not only detailed guides for some layers, as produced by the original developers of the specification, but also our synopsis guides for each layer, which were used as a reference and further training material for the annotators. Also of interest are the general annotator and adjudicator training guides, which outline the general procedures followed by the teams when conducting annotation. Those who are organizing their own annotation projects may find this material useful.Second, the archive contains a comprehensive manifest, in Excel spreadsheet format, listing the word counts, sources, types, and titles (in both Russian and English) of all the texts that are part of the corpus. Finally, the archive contains the actual corpus data files, in Story Workbench format, an XML-encoded stand-off annotation scheme. The scheme is described in the file format specification file, also included in the archive. These files can be parsed with the aid of any normal XML reading software, or can be loaded and edited easily with the Story Workbench annotation tool, also freely available.

An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)

2019-04-08T07:08:57Z

An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material) Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin We analyze reported patches for three prior generate-and-validate patch generation systems (GenProg, RSRepair, and AE). Because of errors in the patch evaluation infrastructure, the majority of the reported patches violate the basic principle behind the design of these systems they do not produce correct outputs even for the inputs in the test suite used to validate the patches. We also show that the overwhelming majority of the accepted patches are not correct and are equivalent to a single modification that simply deletes functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.

Staged Program Repair in SPR (Supplementary Material)

2019-04-08T08:33:47Z

Staged Program Repair in SPR (Supplementary Material) Long, Fan; Rinard, Martin We present SPR, a new program repair system that uses condition synthesis to instantiate transformation schemas to repair program defects. SPR's staged repair strategy combines a rich space of potential repairs with a targeted search algorithm that makes this space viably searchable in practice. This strategy enables SPR to successfully find correct program repairs within a space that contains many correct patches. The majority of these correct patches are not within the search spaces of previous automatic program repair systems.

An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)

2019-04-08T07:42:16Z

Supplementary Materials for "A Survey of Corpora in Computational and Cognitive Narrative Science"

2019-04-08T07:46:32Z

Supplementary Materials for "A Survey of Corpora in Computational and Cognitive Narrative Science" Finlayson, Mark Alan This archive contains supplementary materials for the article titled "A Survey of Corpora in Computational and Cognitive Narrative Science" by Mark A. Finlayson, published in the journal *Sprache und Datenverarbeitung*. The archive contains two files. The first file is the raw bibliographic data of the survey, containing 2600+ citations. The second file is a spreadsheet with the coded features of each corpus, plus the analyses that underlie sections 3 & 4 of the paper.

The N2 Corpus v1.0

2019-04-08T08:23:34Z

The N2 Corpus v1.0 Finlayson, Mark A.; Halverson, Jeffry R.; Corman, Steven R. The N2 Corpus (Narrative Networks Corpus) comprises 100 story texts (42,480 words) relevant to Islamist Extremism, drawn from religious stories, online material, and promotional magazines. The corpus has been annotated for 14 different layers of syntax and semantics. This v1.0 version is missing 33 texts that will be added in later versions. The corpus is described in: Mark A. Finlayson, Jeffry R. Halverson, and Steven R. Corman (2014) "The N2 Corpus: A semantically annotated collection of Islamist extremist stories", Proceedings of the 9th Language Resources and Evaluation Conference (LREC), Reykjavik, Iceland.

Code for Java Libraries for Accessing the Princeton Wordnet: Comparison and Evaluation

2019-04-08T08:24:24Z

Code for Java Libraries for Accessing the Princeton Wordnet: Comparison and Evaluation Finlayson, Mark Alan This archive contains the code and data for running the evaluations described in: Finlayson, Mark Alan (2014) "Java Libraries for Accessing the Princeton Wordnet: comparison and Evaluation" in Proceedings of the 7th Global Wordnet Conference (GWC 2014). Tartu, Estonia, 25-29 January 2014. The archive contains five Eclipse projects (compatible with Eclipse 3.8.0) that may be imported directly into an Eclipse workspace. You will need a Java 1.4, 1.5, and 1.6 JRE to run all the code in the archive. Paper abstract: Java is a popular programming language for natural language processing. I compare and evaluate 12 Java libraries designed to access the information in the original Princeton Wordnet databases. From this comparison emerges a set of decision criteria that will enable a user to pick the library most suited to their purposes. I identify five deciding features: (1) availability of similarity metrics; (2) support for editing; (3) availability via Maven; (4) compatibility with retired Java versions; and (5) support for Enterprise Java. I also provide a comparison of other features of each library, the information exposed by each API, and the versions of Wordnet each library supports, and I evaluate each library for the speed of various retrieval operations. In the case that the user's application does not require one of the deciding features, I show that my library, JWI, the MIT Java Wordnet Interface, is the highest-performance, widest-coverage, easiest-to-use library available.

Understanding the Performance of Broadband Networks through the Statistical Analysis of Speed Tests - Supplemental materials

2019-04-08T07:49:39Z

Understanding the Performance of Broadband Networks through the Statistical Analysis of Speed Tests - Supplemental materials García, Rubén Supplemental materials for the master thesis "Understanding the Performance of Broadband Networks Through the Statistical Analysis of Speed Tests", by Rubén García, submitted in May 2011 for the S.M. in Technology and Policy. Supplemental materials include: Source_code: Folder containing the source code for the statistical analysis of NDT speed tests, written for the R statistical package; NDT_data: Folder containing the following datasets (1) ndt4.h5: Initial NDT data that we used for the analysis; (2) ndt3.h5: Reduced version of the ndt4 dataset (same tests but less variables), also contains the 'whois' file that we combine with the NDT data in order to add location information; (3) comcast-ndt.h5: dataset containing the speed tests of a controlled experiment that we ran using different test durations; Aggregated_datasets: Versions of the ndt4.h5 dataset aggregated by IP and by Autonomous System.

jMWE v1.0.0

2019-04-08T08:05:23Z

jMWE v1.0.0 Finlayson, Mark Alan; Kulkarni, Nidhi jMWE is a Java library for constructing and testing Multi-Word Expression detectors. The library has three main facilities: (1) a detector API, (2) a MWE index facility, and (3) a test harness. This is version 1.0.0 of the library. It contains the source code, compiled binary files, javadocs, a user's manual (pdf), and data for constructing a default MWE index. The freely available version of jMWE is licensed for use for non-commercial purposes only, as long as proper acknowledgment is made. Details can be found in the license, which is included at the end of this document. The copyright on the software is owned by MIT; if you wish to use the software for commercial purposes, please contact the MIT Technology Licensing Office for more information on how to obtain a commercial license. "June 2011."

Source code and data for MWE'2011 papers

2019-04-05T16:44:13Z

Source code and data for MWE'2011 papers Finlayson, Mark Alan; Kulkarni, Nidhi Contains the source code and data necessary to run all computations described in the following two papers: Finlayson, Mark A. and Kulkarni, Nidhi (2011) "Detecting Multi-Word Expressions improves Word Sense Disambiguation", in Proceedings of the 2011 Workshop on Multiword Expressions, held at ACL'2011 in Portland, OR; Kulkarni, Nidhi and Finlayson, Mark A. (2011) "jMWE: A Java Toolkit for Detecting Multi-Word Expressions" in Proceedings of the 2011 Workshop on Multiword Expressions, held at ACL'2011 in Portland, OR.

UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1

2019-04-08T07:42:59Z

UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1 Finlayson, Mark Alan; Hervas, Raquel The corpus comprises 62 files in "Story Workbench" annotation format: 30 folktales in English from a variety of sources, and 32 Wall Street Journal articles selected to coincide with articles found in the Penn Treebank. The files are annotated with the location of referring expressions, coreference relations between the referring expressions, and so-called "indication structures", which split referring expressions into constituents (nuclei and modifiers) and mark each constituent as either 'distinctive' or 'descriptive', indicating whether or not the constituent contains information required for uniquely identifying the referent. The files distributed in this corpus archive are the gold-standard files, which were constructed by merging annotations done by two trained annotators. The contents of this corpus, the annotation procedure, and the indication structures are described in more detail in a paper titled "The Prevalence of Descriptive Referring Expressions in News and Narrative" published in the proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, held in July 2010 in Uppsala, Sweden (ACL-2010). A near-final version of the paper is included in the doc/ directory of the compressed corpus archive file. This is version 1.1 of the UMIREC corpus, in which the coreference annotations have been fixed relative to version 1.0. UMIREC v1.0 suffered from a bug in the export script that corrupted the coreference data.

UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus)

2019-04-06T15:23:32Z

UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) Hervas, Raquel; Finlayson, Mark Alan This version of the UMIREC corpus has been superseded by version 1.1, found at http://hdl.handle.net/1721.1/57507. Please do not use version 1.0, as it contains corrupted coreference information. The correct, uncorrupted data is found in version 1.1.

Code for LOLCAT Method (Variant of Gillespie Algorithm)

2019-04-08T07:44:12Z

Code for LOLCAT Method (Variant of Gillespie Algorithm) Beal, Jacob; Indurkhya, Sagar This code and data is publicly listed code for the LOLCAT Method developed by Sagar Indurkhya and Jacob Beal, in the paper: "Reaction factoring and bipartite update graphs accelerate the Gillespie algorithm for large-scale biochemical systems."

Simple LCD Transmitter Camera Receiver Data Link

2019-04-10T09:16:10Z

Simple LCD Transmitter Camera Receiver Data Link Katabi, Dina; Raskar, Ramesh; Mohan, Ankit; Woo, Grace We demonstrate a freespace optical system using a consumer camera and projector in indoor environments using available devices for visual computing. Through design, prototype and experimentation with this commodity hardware, we analyze a practical optical solution as well as the drawbacks for current wireless challenges unmet by classic RF wireless communication. We summarize and introduce some new applications enabled by such similar setups.

Sepia: a Framework for Natural Language Semantics

2019-04-10T09:16:08Z

Sepia: a Framework for Natural Language Semantics Marton, Gregory Adam; Westrick, Linda Brown To help explore linguistic semantics in the context of computational natural language understanding, Sepia provides a realization the central theoretical idea of categorial grammar: linking words and phrases to compositional lambda semantics. The Sepia framework provides a language in which to express complex transformations from text to data structures, and tools surrounding that language for parsing and machine learning. Lambda semantics are expressed as arbitrary Scheme programs, unlimited in the semantic representations they may build, and the rules for transformation are expressed in Combinatory Categorial Grammar, though the details of grammar formalism may be easily changed. This report explains the major design decisions, and is meant to teach the reader how to understand Sepia semantics and how to create lexical items for a new language understanding task. Source code and technical description

Style Translation for Human Motion (Supplemental Material)

2019-04-06T02:56:35Z

Style Translation for Human Motion (Supplemental Material) Hsu, Eugene; Pulli, Kari; Popovic, Jovan Style translation is the process of transforming an input motion into a new style while preserving its original content. This problem is motivated by the needs of interactive applications, which require rapid processing of captured performances. Our solution learns to translate by analyzing differences between performances of the same content in input and output styles. It relies on a novel correspondence algorithm to align motions, and a linear time-invariant model to represent stylistic differences. Once the model is estimated with system identification, our system is capable of translating streaming input with simple linear operations at each frame.

Interactive Simulation of Stylized Human Locomotion

2019-04-08T07:59:27Z

Interactive Simulation of Stylized Human Locomotion Silva, Marco da; Popovic, Jovan; Abe, Yeuhi Animating natural human motion in dynamic environments is difficult because of complex geometric and physical interactions. Simulation provides an automatic solution to parts of this problem, but it needs control systems to produce lifelike motions. This paper describes the systematic computation of controllers that can reproduce a range of locomotion styles in interactive simulations. Given a reference motion that describes the desired style, a derived control system can reproduce that style in simulation and in new environments. Because it produces high-quality motions that are both geometrically and physically consistent with simulated surroundings, interactive animation systems could begin to use this approach with more established kinematic methods.

Guided Time Warping for Motion Editing

2019-04-08T07:29:53Z

Guided Time Warping for Motion Editing Hsu, Eugene; Silva, Marco da; Popovic, Jovan Time warping allows users to modify timing without affecting poses. It has many applications in animation systems for motion editing, such as refining motions to meet new timing constraints or modifying the acting of animated characters. However, time warping typically requires many manual adjustments to achieve the desired results. We present a technique which simplifies this process by allowing time warps to be guided by a provided reference motion. Given few timing constraints, it computes a warp that both satisfies these constraints and maximizes local timing similarities to the reference. The algorithm is fast enough to incorporate into standard animation workflows. We apply the technique to two common tasks: preserving the natural timing of motions under new time constraints and modifying the timing of motions for stylistic effects.

Style Translation for Human Motion

2019-04-05T16:19:02Z

Style Translation for Human Motion Hsu, Eugene; Pulli, Kari; Popovic, Jovan Style translation is the process of transforming an input motion into a new style while preserving its original content. This problem is motivated by the needs of interactive applications, which require rapid processing of captured performances. Our solution learns to translate by analyzing differences between performances of the same content in input and output styles. It relies on a novel correspondence algorithm to align motions, and a linear time-invariant model to represent stylistic differences. Once the model is estimated with system identification, our system is capable of translating streaming input with simple linear operations at each frame.

Example-Based Control of Human Motion

2019-04-08T07:29:53Z

Example-Based Control of Human Motion Hsu, Eugene; Gentry, Sommer; Popovic, Jovan In human motion control applications, the mapping between a control specification and an appropriate target motion often defies an explicit encoding. We present a method that allows such a mapping to be defined by example, given that the control specification is recorded motion. Our method begins by building a database of semantically meaningful instances of the mapping, each of which is represented by synchronized segments of control and target motion. A dynamic programming algorithm can then be used to interpret an input control specification in terms of mapping instances. This interpretation induces a sequence of target segments from the database, which is concatenated to create the appropriate target motion. We evaluate our method on two examples of indirect control. In the first, we synthesize a walking human character that follows a sampled trajectory. In the second, we generate a synthetic partner for a dancer whose motion is acquired through motion capture.

Simulation of Human Motion Data using Short-Horizon Model-Predictive Control

2019-04-08T08:07:08Z

Simulation of Human Motion Data using Short-Horizon Model-Predictive Control Silva, Marco da; Abe, Yeuhi; Popovic, Jovan Many data-driven animation techniques are capable of producing high quality motions of human characters. Few techniques, however, are capable of generating motions that are consistent with physically simulated environments. Physically simulated characters, in contrast, are automatically consistent with the environment, but their motionsare often unnatural because they are difficult to control. We present a model-predictive controller that yields natural motions by guiding simulated humans toward real motion data. During simulation, the predictive component of the controller solves a quadratic program to compute the forces for a short window of time into the future. These forces are then applied by a low-gain proportional-derivative component, which makes minor adjustments until the next planning cycle. The controller is fast enough for interactive systems such as games and training simulations. It requires no precomputation and little manual tuning. The controller is resilient to mismatches between the character dynamics and the input motion, which allows it to track motion capture data even where the real dynamics are not known precisely. The same principled formulation can generate natural walks, runs, and jumps in a number of different physically simulated surroundings.

Factors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects

2019-04-12T08:38:35Z

Factors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects Ehrmann, Stephen C.; Gilbert, Steven W.; McMartin, Flora Initiated in 1999, iCampus is a research collaboration between Microsoft Research and MIT whose goal is to create and demonstrate technologies with the potential for revolutionary change throughout the university curriculum.” The program was made possible by a $25 million research grant from Microsoft to MIT, and involves extensive collaboration between MIT and Microsoft staff.

This assessment study by the TLT Group addresses the question: The TLT Group has been asked, “In light of the experience of iCampus, especially those projects selected by MIT and Microsoft for close study, what can be learned about priorities for educational technology initiatives in the future and about how the spread of such innovations can be more effectively supported?”

The major conclusions are that the five projects studied improved important elements of an MIT education by making learning more authentic, active, collaborative, and feedback-rich. Nevertheless, wider adoption beyond MIT was extremely difficult to achieve, largely due to structure issues in universities that make it difficult for educational technology to spread beyond the initial innovators, even to other departments within the same institution. The report includes recommendations for universities, external sponsors, and for MIT in particular, about steps to take to achieve more effective dissemination.

Table 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2

2019-04-12T07:37:33Z

Table 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2 Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. Table 2 (Supplemental): Complete data for all 100 recurrent expression programs (EPs) discovered by GeneProgram. Each EP has two identifying rows, a list of meta-genes, and a list of significantly enriched GO categories. The first identifying row has three columns: (1) the EP identifier (an arbitrarily assigned number), (2) the number of meta-genes in the EP, and (3) the percentage of samples the EP occurs in. The identifying row lists all tissues that use the EP (h_ = human tissue, m_ = mouse tissue). Numbers in parentheses next to each tissue indicate the degree to which the tissue uses the EP.After the identifying rows the set of meta-genes in the EP are listed. Each meta-gene has eight columns: (1) the human RefSeq identifier, (2) the mouse RefSeq identifier, (3) the empirical mean expression level, (4) the empirical mean occurrence percentage, (5) the human gene name, (6) the human Swis-Prot description, (7) the mouse gene name, and (8) the mouse Swis-Prot description.Following the meta-genes are lists of significant GO categories (the first list uses human annotations, and the second uses mouse annotations). The columns for each line in this list are: (1) GO term, (2) enrichment p-value, (3) number of genes in the EP in the category/total genes in the EP with some GO category, (4) category description, and (5) total number of genes in the category that are also in the dataset analyzed.

Table 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data

2019-04-12T08:38:30Z

Table 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. Table 1 (Supplemental): Summary of recurrent expression programs (EPs) discovered by GeneProgram. The columns are: (1) the EP identifier (an arbitrarily assigned number), (2) the number of genes in the EP, (3) the number of tissues in the EP, (4) the species using the EP (i.e., one or more tissues from the species uses the EP, H = human, M = mouse), (5) the generality score (GS), (6) the top three tissues using the EP (numbers in parentheses = usage percentages), (7)-(9) the GO category name, GO term, and associated p-value for the most abundant significantly enriched category (i.e., the significant category with the most genes overlapping with the EP's genes).

The Creation of OpenCourseWare at MIT

2025-07-24T17:44:16Z

The Creation of OpenCourseWare at MIT Abelson, Harold This paper traces the genesis of the MIT OpenCourseWare project from its initial strategic precursors in 1999 and 2000, through its launch in 2001 and its subsequent evolution. The story told here illuminates the interplay among institutional leadership, and strategic planning, and with university culture in launching major educational technology enterprises. It also shows how initiatives can evolve in unexpected ways, and can even surpass their initial goals. The paper concludes with an overview of challenges facing OpenCourseWare in moving from the end of its production ramp-up and towards sustainability.

Principles for Engineered Emergence (slides)

2019-04-12T07:40:31Z

Principles for Engineered Emergence (slides) Beal, Jacob Principles for Engineered EmergenceIt is difficult to establish engineering control over the behavior ofaggregates of unreliable devices with complicated interactionpatterns. I take a linguistic view of this problem, searching formechanisms that simplify the composition and abstraction ofcomplicated behaviors. From my work on various problems of aggregatecontrol in cognitive architectures and spatial computing, I havenoticed common themes in mechanisms that solve them. From these, Iextract four principles which seem to help in engineering robustaggregate behavior---self-scaling, sparseness, gradual degradation,and failure simplification---and give examples of how they can beexploited.

Nuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements

2019-04-12T13:39:31Z

Nuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements Marton, Gregory TREC Definition and Relationship questions are evaluated on thebasis of information nuggets that may be contained in systemresponses. Human evaluators provide informal descriptions of eachnugget, and judgements (assignments of nuggets to responses) for eachresponse submitted by participants.The best present automatic evaluation for these kinds of questions isPourpre. Pourpre uses a stemmed unigram similarity of responses withnugget descriptions, yielding an aggregate result that is difficult tointerpret, but is useful for relative comparison. Nuggeteer, bycontrast, uses both the human descriptions and the human judgements,and makes binary decisions about each response, so that the end resultis as interpretable as the official score.I explore n-gram length, use of judgements, stemming, and termweighting, and provide a new algorithm quantitatively comparable to,and qualitatively better than the state of the art.