Rules and Requirements

No private sharing outside teams

Privately sharing code or data outside of teams is not permitted. It's okay to share code if made available to all participants.

Team Limits

There is no maximum team size, but the size of the honorarium remains constant regardless of the number of people within the team.

Submission Limits

There are no submission limits; however, only the last submission sent before 5:00 PM EST will be evaluated for that day’s leaderboard update.
You may select up to 1 final submission for judging.
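The daily cutoff can be illustrated with a short sketch. It assumes a hypothetical list of (timestamp, submission id) pairs and treats EST as a fixed UTC-5 offset. The data layout and the function name are illustrative only and are not part of the Challenge platform.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: "EST" is treated as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), name="EST")
CUTOFF = time(17, 0)  # 5:00 PM


def last_before_cutoff(submissions, day):
    """Return the id of the latest submission sent before 5:00 PM EST on `day`.

    `submissions` is a hypothetical list of (timezone-aware datetime, id) pairs.
    Returns None when nothing was sent before the cutoff.
    """
    cutoff = datetime.combine(day, CUTOFF, tzinfo=EST)
    eligible = [(ts, sid) for ts, sid in submissions if ts < cutoff]
    if not eligible:
        return None
    # The latest timestamp among the eligible submissions wins.
    return max(eligible, key=lambda pair: pair[0])[1]


example = [
    (datetime(2022, 11, 1, 9, 30, tzinfo=EST), "sub-a"),
    (datetime(2022, 11, 1, 16, 59, tzinfo=EST), "sub-b"),
    (datetime(2022, 11, 1, 17, 5, tzinfo=EST), "sub-c"),  # after the cutoff
]
picked = last_before_cutoff(example, date(2022, 11, 1))
```

In this example `picked` is `"sub-b"`, because `"sub-c"` arrived five minutes after the cutoff.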


CHALLENGE HOST: The Learning Agency Lab
CHALLENGE HOST ADDRESS: 2307 S Rural Rd, Tempe, AZ 85282, USA

Each participant Team will be comprised of a single Team Lead (as defined herein) and any number of Team Members (as defined herein). A Team may be comprised of a Team Lead and no other Team Members. A team lead (“Team Lead”) is the individual selected by the Challenge Host for a Team, who will be the primary point of contact between the Challenge Host and such Team. Team Leads must meet the eligibility requirements of these Rules (as defined herein). The platform for the Challenge will have one account for every Team Lead, which is the only valid submission point for Submissions for such Team Lead’s Team. Team Leads are the only individuals eligible to receive the Honorariums.

A team member (“Team Member”) is any individual working on a Team under a Team Lead. There is no limit on how many Team Members may be part of a Team. Team Members do not have an account on the platform for the Challenge, and will not receive an Honorarium for the Challenge.

The Challenge is open to residents of the United States and worldwide, except that if you are a resident of Cuba, Iran, Syria, North Korea, Russia, Sudan, or the Crimea, Donetsk or Luhansk region of Ukraine, or are subject to U.S. export controls or sanctions, you may not enter the Challenge. Other local rules and regulations may apply to you, so please check your local laws to ensure that you are eligible to participate in skills-based competitions.


These Challenge Rules are subject to change at any time. The Lab will use commercially reasonable efforts to notify participants of any changes to these Rules, including by posting the modified Rules on the Challenge Website (the “Site”).

The Challenge named above is a skills-based challenge to promote and further the field of data science. Your challenge submissions ("Submissions") must conform to the requirements stated on the Challenge Website. Your Submissions will be scored based on the evaluation metric described on the Challenge Website. See below for the complete Challenge Rules.


In addition to the provisions of the General Challenge Rules below, you understand and agree to these Challenge-Specific Rules required by the Challenge Host:


Under Section 11 (Participants’ Obligations) of the General Rules below, you hereby grant and will grant Challenge Host the following license(s) with respect to your Submission as a Challenge participant:

Open Source: You hereby license and will license your winning Submission and the source code used to generate the Submission under an Open Source MIT license that in no event limits commercial use of such code or of any model containing or depending on such code. To the extent your Submission makes use of generally commercially available software not owned by you that you used to generate your Submission, but that can be procured by the Challenge Host without undue expense, you do not grant the license in the preceding sentence to that software.


Individual participants and Teams may use automated machine learning tool(s) (“AMLT”) (e.g., Google AutoML, H2O Driverless AI, etc.) to create a Submission, provided that the participant or Team ensures that they have an appropriate license to the AMLT such that they are able to comply with the Challenge Rules.


To enter the Challenge, you (both Team Leads and Team Members) must agree to these Official Challenge Rules, which include these General Challenge Rules and incorporate by reference the provisions and content of the Challenge-Specific Rules above (collectively, the "Rules"). Please read these Rules carefully to ensure you understand and agree. You further agree that submission of an entry in the Challenge constitutes agreement to these Rules. Team Leads may not submit an entry to the Challenge and are not eligible to receive the honorarium associated with this Challenge (“Honorarium”) unless they agree to these Rules. These Rules form a binding legal agreement between you and the Challenge Host with respect to the Challenge.


A. To be eligible to enter the Challenge, you must be:
(i) the older of 18 years old or the age of majority in your jurisdiction of residence (unless otherwise agreed to by Challenge Host and appropriate parental/guardian consents have been obtained by Challenge Host);
(ii) not a resident of Cuba, Iran, Syria, North Korea, Russia, Sudan, or the Crimea, Donetsk or Luhansk region of Ukraine; and
(iii) not a person or representative of an entity under U.S. export controls or sanctions.

If you are entering as a representative of a company, educational institution or other legal entity, or on behalf of your employer, the Rules are binding on you, individually, and on the entity you represent or of which you are an employee. If you are acting within the scope of your employment, as an employee, contractor, or agent of another party, you warrant that such party has full knowledge of your actions and has consented thereto, including your receipt of an Honorarium (if participating as a Team Lead). You further warrant that your actions do not violate your employer's or entity's policies and procedures.

The Challenge Host reserves the right to verify eligibility and to adjudicate on any dispute at any time. If you provide any false information relating to the Challenge concerning your identity, residency, mailing address, telephone number, email address, ownership of right, or information required for entering the Challenge, you may be immediately disqualified from the Challenge.

B. Unless otherwise stated in the Rules or prohibited by internal policies of the Challenge Entities, employees, interns, contractors, officers and directors of Challenge Entities may enter and participate in the Challenge, but are not eligible to receive the Honorarium. "Challenge Entities" means the Challenge Host, and their respective parent companies, subsidiaries and affiliates. If you are such a participant from a Challenge Entity, you are subject to all applicable internal policies of your employer with respect to your participation.


The Challenge will run from the Start Date (October 28, 2022) to the Final Submission Deadline (February 10, 2023) (such duration, the “Challenge Period”). The Challenge Period is subject to change, and Challenge Host may introduce additional hurdle deadlines during the Challenge Period. Any updated or additional deadlines will be publicized on the Challenge Website. It is your responsibility to check the Challenge Website regularly to stay informed of any deadline changes.


NO PURCHASE NECESSARY TO ENTER OR RECEIVE AN HONORARIUM (IN THE CASE OF TEAM LEADS). To participate in the Challenge as a Team Lead, you must be selected by the Challenge Host to participate as a Team Lead for a Team, and you must accept the terms of the honorarium agreement. Team Leads may choose the Team Members for their Team, subject to the Eligibility Requirements set forth in Section 2 of these Rules.


"Challenge Data" means the data or datasets available from the Challenge Website for the purpose of use in the Challenge, including any prototype or executable code provided on the Challenge Website. The Challenge Data will contain private and public test sets. Which data belongs to which set will not be made available to participants.

A. Data Access and Use. You may access and use the Challenge Data only for participating in the Challenge and on the official Slack channel that the Challenge Host shared. The Challenge Host reserves the right to disqualify any participant who uses the Challenge Data other than as permitted by the Challenge Website and these Rules.

B. External Data. You may use data other than the Challenge Data (“External Data”) to develop and test your Submissions. However, you will ensure the External Data is publicly available and equally accessible to use by all participants of the Challenge for purposes of the Challenge at no cost to the other participants, and you shall not use any data from the FairyTale QA dataset that is not part of the Challenge Data. The ability to use External Data under this Section 5.C (External Data) does not limit your other obligations under these Challenge Rules, including but not limited to Section 9 (Participants Obligations).


A. Private Code Sharing. Unless otherwise specifically permitted under the Challenge Website or the Rules, during the Challenge Period, you are not allowed to privately share source or executable code developed in connection with or based upon the Challenge Data or other source or executable code relevant to the Challenge (“Challenge Code”). This prohibition includes sharing Challenge Code between separate Teams. Any such private sharing of Challenge Code is a breach of the Rules and may result in disqualification.

B. Public Code Sharing. You are permitted to publicly share Challenge Code, provided that such public sharing does not violate the intellectual property rights of any third party. If you do choose to share Challenge Code or other such code, you are required to share it on the hosting platform, on the discussion forum, or via notebooks associated specifically with the Challenge for the benefit of all competitors. By so sharing, you are deemed to have licensed the shared code under the License.


Each Submission will be scored and ranked by the evaluation metric stated on the Challenge Website. During the Challenge Period, the current ranking will be visible on the Challenge Website's public leaderboard. The potential winner(s) are determined solely by the leaderboard ranking on the private leaderboard, subject to compliance with the Rules. The public leaderboard will be based on the public test set and the private leaderboard will be based on the private test set.

In the event of a tie, the Submission that was entered first to the Challenge will be the winner. In the event a potential winner is disqualified for any reason, the Submission that received the next highest score rank will be chosen as the potential winner.
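The tie-break and disqualification rules can be sketched as a sort followed by a filter. This is only an illustration: the dictionary keys, the function names, and the assumption that a higher score is better are hypothetical, since the actual evaluation metric is defined on the Challenge Website.

```python
from datetime import datetime


def rank_entries(entries, higher_is_better=True):
    """Order entries by score, breaking ties by the earlier entry time.

    Each entry is a hypothetical dict with 'team', 'score', and 'entered_at'.
    """
    sign = -1.0 if higher_is_better else 1.0
    return sorted(entries, key=lambda e: (sign * e["score"], e["entered_at"]))


def potential_winner(entries, disqualified=(), higher_is_better=True):
    """Return the best-ranked team that has not been disqualified, or None."""
    for entry in rank_entries(entries, higher_is_better):
        if entry["team"] not in disqualified:
            return entry["team"]
    return None


entries = [
    {"team": "alpha", "score": 0.91, "entered_at": datetime(2023, 2, 9, 12, 0)},
    {"team": "beta", "score": 0.91, "entered_at": datetime(2023, 2, 8, 9, 0)},
    {"team": "gamma", "score": 0.88, "entered_at": datetime(2023, 2, 1, 8, 0)},
]
winner = potential_winner(entries)
runner_up = potential_winner(entries, disqualified={"beta"})
```

Here `alpha` and `beta` tie on score, so `beta` ranks first because it was entered earlier. If `beta` is disqualified, `alpha` becomes the potential winner.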


The potential winner(s) will be notified by email.

Challenge Host reserves the right to disqualify any participant from the Challenge if the Challenge Host reasonably believes that the participant has attempted to undermine the legitimate operation of the Challenge by cheating, deception, or other unfair playing practices, or has abused, threatened, or harassed any other participants or the Challenge Host.

A disqualified participant may be removed from the Challenge leaderboard, at Challenge Host’s sole discretion.

The final leaderboard list will be publicly displayed. Determinations of Challenge Host are final and binding.


As a condition to being awarded an Honorarium, a Team Lead must fulfill the following obligations:

(a) deliver to the Challenge Host the final model's software code as used to generate the Submission and associated documentation. The delivered software code should follow the documentation guidelines, provided in Appendix A, and must be capable of generating the Submission, and contain a description of resources required to build and/or run the executable code successfully. To the extent that the final model’s software code includes generally commercially available software that is not owned by you, but that can be procured by the Challenge Host without undue expense, then instead of delivering the code for that software to the Challenge Host, you must identify that software, method for procuring it, and any parameters or other information necessary to replicate the Submission;

(b) grant to the Challenge Host the license to the Submission stated in these Rules above, and represent that you have the unrestricted right to grant that license;

(c) sign and return all honorarium acceptance documents as may be required by Challenge Host including without limitation: (i) eligibility certifications; (ii) licenses, releases and other agreements required under the Rules; and (iii) U.S. tax forms (such as IRS Form W-9 if U.S. resident, IRS Form W-8BEN if foreign resident, or future equivalents).


Honorariums are as described on the Challenge Website and shall be awarded to each Team Lead.

All Honorariums will be disseminated upon completion of the Challenge and review of required participant obligations. Honorariums are subject to Challenge Host's review and verification of the participant’s eligibility and compliance with these Rules, and the compliance of the Submissions with the Submissions Requirements, all in Challenge Host’s sole determination. In the event that the Submission demonstrates non-compliance with these Challenge Rules, Challenge Host may at its discretion take either of the following actions: (i) disqualify the Submission(s); or (ii) require the participant to remediate within one week after notice all issues identified in the Submission(s) (including, without limitation, the resolution of license conflicts, the fulfillment of all obligations required by software licenses, and the removal of any software that violates the software restrictions).

The distribution of the Honorarium among Team Members, if any, is at the sole discretion of the Team Lead. Team Leads may choose to share or not share the Honorarium with Team Members. The Challenge Host is not responsible or liable for any direct payment to Team Members.

Challenge winners do not receive any additional prize or honorarium. A potential Challenge winner may decline to be nominated as a Challenge winner in accordance with Section 10.

You are not eligible to receive any Honorarium if you do not meet the Eligibility requirements in Section 2 above.

11. TAXES.

ALL TAXES IMPOSED ON HONORARIUMS ARE THE SOLE RESPONSIBILITY OF THE TEAM LEADS. Payments to Team Leads are subject to the express requirement that they submit all documentation requested by Challenge Host for compliance with applicable state, federal, local and foreign (including provincial) tax reporting and withholding requirements. Honorariums will be net of any taxes that Challenge Host is required by law to withhold. If a Team Lead fails to provide any required documentation or comply with applicable laws, the Honorarium may be forfeited. Any Team Leads who are U.S. residents will receive an IRS Form 1099 in the amount of their Honorarium.


All federal, state, provincial and local laws and regulations apply.


You agree that Challenge Host and its affiliates may use your name and likeness for advertising and promotional purposes without additional compensation, unless prohibited by law.


You acknowledge and agree that Challenge Host may collect, store, share and otherwise use personally identifiable information provided by you during the registration process and the Challenge, including but not limited to, name, mailing address, phone number, and email address (“Personal Information”).

As a controller of such Personal Information, Challenge Host agrees to comply with all U.S. and foreign data protection obligations with regard to your Personal Information. Your Personal Information will be transferred to Challenge Host in the country specified in the Challenge Host Address listed above, which may be a country outside the country of your residence. Such country may not have privacy laws and regulations similar to those of the country of your residence.


You warrant that your Submission is your own original work and, as such, you are the sole and exclusive owner and rights holder of the Submission, and you have the right to make the Submission and grant all required licenses. You agree not to make any Submission that: (i) infringes any third party proprietary rights, intellectual property rights, industrial property rights, personal or moral rights or any other rights, including without limitation, copyright, trademark, patent, trade secret, privacy, publicity or confidentiality obligations, or defames any person; or (ii) otherwise violates any applicable U.S. or foreign state or federal law.

To the maximum extent permitted by law, you indemnify and agree to keep indemnified Challenge Host, its affiliates, and its and their directors, officers, employees and agents (collectively, the “Challenge Entities”) at all times from and against any liability, claims, demands, losses, damages, costs and expenses resulting from any of your acts, defaults or omissions and/or a breach of any warranty set forth herein. To the maximum extent permitted by law, you agree to defend, indemnify and hold harmless the Challenge Entities from and against any and all claims, actions, suits or proceedings, as well as any and all losses, liabilities, damages, costs and expenses (including reasonable attorneys’ fees) arising out of or accruing from: (a) your Submission or other material uploaded or otherwise provided by you that infringes any third party proprietary rights, intellectual property rights, industrial property rights, personal or moral rights or any other rights, including without limitation, copyright, trademark, patent, trade secret, privacy, publicity or confidentiality obligations, or defames any person; (b) any misrepresentation made by you in connection with the Challenge; (c) any non-compliance by you with these Rules or any applicable U.S. or foreign state or federal law; (d) claims brought by persons or entities other than the parties to these Rules arising from or related to your involvement with the Challenge; and (e) your acceptance, possession, misuse or use of any Honorarium, or your participation in the Challenge and any Challenge-related activity.

You hereby release Challenge Entities from any liability associated with: (a) any malfunction or other problem with the Challenge Website; (b) any error in the collection, processing, or retention of any Submission; or (c) any typographical or other error in the printing, offering or announcement of winners.


Challenge Entities are not responsible for any malfunction of the Challenge Website or any late, lost, damaged, misdirected, incomplete, illegible, undeliverable, or destroyed Submissions or entry materials due to system errors, failed, incomplete or garbled computer or other telecommunication transmission malfunctions, hardware or software failures of any kind, lost or unavailable network connections, typographical or system/human errors and failures, technical malfunction(s) of any telephone network or lines, cable connections, satellite transmissions, servers or providers, or computer equipment, traffic congestion on the Internet or at the Challenge Website, or any combination thereof, which may limit a participant’s ability to participate.


If for any reason the Challenge is not capable of running as planned, Challenge Host reserves the right to cancel, terminate, modify or suspend the Challenge.


Under no circumstances will the entry of a Submission, the awarding of an Honorarium, or anything in these Rules be construed as an offer or contract of employment with Challenge Host or any of the Challenge Entities. You acknowledge that you have submitted your Submission voluntarily and not in confidence or in trust. You acknowledge that no confidential, fiduciary, agency, employment or other similar relationship is created between you and Challenge Host or any of the Challenge Entities by your acceptance of these Rules or your entry of your Submission.


Unless otherwise provided in these Rules, all claims arising out of or relating to these Rules will be governed by Arizona law, excluding its conflict of laws rules, and will be litigated exclusively in the Federal or State courts of Maricopa County, Arizona, USA. The parties consent to personal jurisdiction in those courts. If any provision of these Rules is held to be invalid or unenforceable, all remaining provisions of the Rules will remain in full force and effect.


Below are the standard expectations for FINAL Model Documentation. There are two standard components:

  1. Model Summary: Requirements detailed in section A
  2. Submission Model: Requirements detailed in section B


General Guidelines:
Documentation should be in Word or PDF format. It should be in English.
The below should be considered helpful guidance. You can ignore any questions that are not relevant. You should also add useful details that are not covered by the questions.

A1. Background on you/your team

  • Team Name:
  • Public Leaderboard Score:
  • Public Leaderboard Place:
[For each team member]
  • Name:
  • Location:
  • Email:

A2. Background on you/your team

If part of a team, please answer these questions for each team member. For larger teams (3+), please give shorter responses.

  • What is your academic/professional background?
  • Did you have any prior experience that helped you succeed in this competition?
  • What made you decide to enter this competition?
  • How much time did you spend on the competition?
  • If part of a team, how did you decide to team up?
  • If you competed as part of a team, who did what?

A3. Summary

In four to six sentences, summarize the most important aspects of your model and analysis, such as:

  • The training method(s) you used (e.g., Convolutional Neural Network, XGBoost)
  • The most important feature
  • The tool(s) you used
  • How long it takes to train your model with your computing resources

A4. Features Selection / Engineering

  • What were the most important features? (if applicable)
  • How did you select features? (if applicable)
  • Did you make any important feature transformations? (if applicable)
  • Did you find any interesting interactions between features? (if applicable)
  • Did you use external data?

A5. Training Method(s)

  • What training methods did you use?
  • Did you ensemble the models?
  • If you did ensemble, how did you weight the different models?

A6. Interesting findings

  • What was the most important trick you used?
  • What do you think set you apart from others in the competition?
  • Did you find any interesting relationships in the data that don't fit in the sections above?

A7. Simple Features and Methods

  • Is there a subset of features that would get 90-95% of your final performance? Which features (no more than 10)? (if applicable)
  • What model was most important (limited to one training method)?
  • What would be the simplified model score?

A8. Model Execution Time

  • How long does it take to train your model?
  • How long does it take to generate predictions using your model?
  • How long does it take to train the simplified model (referenced in section A7)?
  • How long does it take to generate predictions from the simplified model?
  • What computing resources did you use to get these execution times?

A9. References

Citations to references, websites, blog posts, and external sources of information where appropriate.


Models should be submitted in a single zip archive that contains all of the items detailed below.
Below are the requirements for documenting and delivering your solution. We ask for this information so that we have all the pieces needed to reproduce your solution with the score your team achieved on the leaderboard within a reasonable margin. Please make sure your code is well commented.

B1. All code, data, and your trained model goes in a single archive


B2. README

Create a README file at the top level of the archive. This file should concisely and precisely describe the following:

  • The hardware you used: CPU specs, number of CPU cores, memory, GPU specs, number of GPUs
  • OS/platform you used, including version number
  • Any necessary 3rd-party software, including version numbers, and installation steps. This can be provided as a Dockerfile instead of as a section in the readme
  • How to train your model
  • How to make predictions on a new test set
  • Important side effects of your code (ex: if your data processing code overwrites the original data)
  • Key assumptions made by your code (ex: if the outputs folder must be empty when starting a training run)
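Some of the platform details requested above can be collected programmatically. The following is a minimal sketch using only the Python standard library; the function name and dictionary keys are illustrative. Memory size and GPU specifications are not exposed by the standard library and would need to be gathered with other tools.

```python
import os
import platform
import sys


def environment_summary():
    """Collect basic OS and CPU details suitable for pasting into a README."""
    return {
        "os": platform.platform(),          # OS name and version string
        "machine": platform.machine(),      # CPU architecture, e.g. x86_64
        "cpu_cores": os.cpu_count(),        # logical cores (may be None)
        "python": sys.version.split()[0],   # interpreter version
    }


summary = environment_summary()
for key, value in summary.items():
    print(f"{key}: {value}")
```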

B3. Configuration files

Create a sub-folder with any necessary configuration files, such as `$HOME/.keras/keras.json`. The README should also include a description of what these files are and where they need to be placed to function.

B4. requirements.txt

Create a requirements.txt file at the top level of the archive. This should specify the exact version of all of the packages used, such as `pandas==0.23.0`. This can be generated with tools like `pip freeze` in Python or `devtools::session_info()` in R. The requirements file can also be replaced with a Dockerfile, as long as the installations all use exact version numbers.
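As an alternative to `pip freeze`, exact pins can also be listed from within Python. The sketch below uses the standard `importlib.metadata` module; the function name is illustrative, and the output covers every distribution installed in the current environment, not only the packages your code imports.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return sorted 'name==version' lines for all installed distributions."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


requirements = pinned_requirements()
# To write the file: open("requirements.txt", "w").write("\n".join(requirements))
```

It is usually worth trimming the result to the packages the solution actually needs before adding it to the archive.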

B5. directory_structure.txt

Create a readout of the directory tree at the top level of the archive. This should be in the format generated by running the Linux command `find . -type d > directory_structure.txt` from the top level of your project folder.
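On systems without `find`, a similar listing can be produced in Python. This is a rough equivalent rather than an exact replica: the function name is illustrative, and the entries are sorted, whereas `find` does not guarantee an order.

```python
import os
import tempfile


def list_directories(root="."):
    """List every directory under `root` in the './sub/dir' style used by find."""
    paths = []
    for current, dirnames, _ in os.walk(root):
        dirnames.sort()  # sort in place so the walk order is deterministic
        rel = os.path.relpath(current, root)
        paths.append("." if rel == "." else "./" + rel.replace(os.sep, "/"))
    return paths


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src", "utils"))
    os.makedirs(os.path.join(tmp, "models"))
    tree = list_directories(tmp)

# To write the file: open("directory_structure.txt", "w").write("\n".join(tree))
```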


B6. SETTINGS.json

This file specifies the paths to the train, test, model, and output directories.

  1. This is the only place that specifies the path to these directories.
  2. Any code that is doing I/O should use the appropriate base paths from SETTINGS.json
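A possible layout for SETTINGS.json, together with a small loader, is sketched below. The key names are the ones used in the entry-point description in this document; the directory values, the file names inside them, and the function names are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical SETTINGS.json contents; only the key names come from this document.
EXAMPLE_SETTINGS = """
{
  "RAW_DATA_DIR": "./data/raw",
  "CLEAN_DATA_DIR": "./data/clean",
  "TRAIN_DATA_CLEAN_PATH": "./data/clean/train.csv",
  "TEST_DATA_CLEAN_PATH": "./data/clean/test.csv",
  "MODEL_DIR": "./models",
  "CHECKPOINT_DIR": "./checkpoints",
  "SUBMISSION_DIR": "./submissions"
}
"""


def parse_settings(text):
    """Turn the JSON text into a dict mapping each key to a pathlib.Path."""
    return {key: Path(value) for key, value in json.loads(text).items()}


def load_settings(path="SETTINGS.json"):
    """Read SETTINGS.json from disk; all I/O code should go through this."""
    return parse_settings(Path(path).read_text(encoding="utf-8"))


settings = parse_settings(EXAMPLE_SETTINGS)
model_file = settings["MODEL_DIR"] / "model.pkl"  # build paths from base dirs
```

Keeping every path behind one loader means that relocating the data only requires editing SETTINGS.json.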

B7. Serialized copy of the trained model

Save a copy of the trained model to disk. This enables code to use the trained model to make predictions on new data points without re-training the model (which is typically much more time-intensive). If model checkpoint files were part of your normal workflow, the README should list the path to the folder you saved them in.
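As a minimal illustration of saving and restoring a model, the sketch below serializes a stand-in object with Python's `pickle` module. The parameter dictionary, file name, and function names are hypothetical. Most machine learning frameworks provide their own save and load routines, which are generally preferable for real models.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, model_dir, name="model.pkl"):
    """Serialize `model` into `model_dir` and return the file path."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    path = model_dir / name
    with path.open("wb") as handle:
        pickle.dump(model, handle)
    return path


def load_model(path):
    """Restore a model previously written by save_model."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


# Hypothetical stand-in for a trained model: a dict of learned parameters.
trained = {"kind": "mean_predictor", "mean": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model(trained, tmp)
    restored = load_model(saved_path)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.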


A list of the commands required to run your code. As a best practice, separate training code from prediction code. For example, if you’re using Python, there would be up to three entry points to your code:

  • A data preparation script, run with python, which would:
    • Read training data from RAW_DATA_DIR (specified in SETTINGS.json)
    • Run any preprocessing steps (and specify seed state(s) if a random number generator is used)
    • Save the cleaned data to CLEAN_DATA_DIR (specified in SETTINGS.json)
  • A training script, run with python, which would:
    • Read training data from TRAIN_DATA_CLEAN_PATH (specified in SETTINGS.json)
    • Train your model. If checkpoint files are used, specify CHECKPOINT_DIR in SETTINGS.json. If a random number generator is used, specify the seed state(s).
    • Save your model to MODEL_DIR (specified in SETTINGS.json)
  • A prediction script, run with python, which would:
    • Read test data from TEST_DATA_CLEAN_PATH (specified in SETTINGS.json)
    • Load your model from MODEL_DIR (specified in SETTINGS.json)
    • Use your model to make predictions on new samples
    • Save your predictions to SUBMISSION_DIR (specified in SETTINGS.json)
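The three stages above can be sketched as three functions that share one settings dictionary. Everything here is a toy under stated assumptions: the JSON file names, the record fields (`id`, `target`), and the "model" (a simple mean) are hypothetical, and in a real archive each function would typically live in its own script.

```python
import json
import random
import tempfile
from pathlib import Path

SEED = 42  # fixed seed so the shuffle is reproducible


def prepare_data(settings):
    """Read raw rows, drop rows without a target, save the cleaned copy."""
    raw = json.loads((settings["RAW_DATA_DIR"] / "train.json").read_text())
    clean = [row for row in raw if row.get("target") is not None]
    random.Random(SEED).shuffle(clean)
    settings["CLEAN_DATA_DIR"].mkdir(parents=True, exist_ok=True)
    settings["TRAIN_DATA_CLEAN_PATH"].write_text(json.dumps(clean))


def train(settings):
    """Fit the toy model (mean of the targets) and save it to MODEL_DIR."""
    rows = json.loads(settings["TRAIN_DATA_CLEAN_PATH"].read_text())
    model = {"mean": sum(row["target"] for row in rows) / len(rows)}
    settings["MODEL_DIR"].mkdir(parents=True, exist_ok=True)
    (settings["MODEL_DIR"] / "model.json").write_text(json.dumps(model))


def predict(settings):
    """Load the model, predict for each test row, save to SUBMISSION_DIR."""
    rows = json.loads(settings["TEST_DATA_CLEAN_PATH"].read_text())
    model = json.loads((settings["MODEL_DIR"] / "model.json").read_text())
    preds = [{"id": row["id"], "prediction": model["mean"]} for row in rows]
    settings["SUBMISSION_DIR"].mkdir(parents=True, exist_ok=True)
    (settings["SUBMISSION_DIR"] / "submission.json").write_text(json.dumps(preds))
    return preds


# Demonstration in a temporary directory; real code would load SETTINGS.json.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    settings = {
        "RAW_DATA_DIR": root / "raw",
        "CLEAN_DATA_DIR": root / "clean",
        "TRAIN_DATA_CLEAN_PATH": root / "clean" / "train.json",
        "TEST_DATA_CLEAN_PATH": root / "clean" / "test.json",
        "MODEL_DIR": root / "model",
        "SUBMISSION_DIR": root / "submission",
    }
    settings["RAW_DATA_DIR"].mkdir()
    (settings["RAW_DATA_DIR"] / "train.json").write_text(json.dumps([
        {"id": 1, "target": 1.0},
        {"id": 2, "target": None},  # dropped during preparation
        {"id": 3, "target": 3.0},
    ]))
    prepare_data(settings)
    settings["TEST_DATA_CLEAN_PATH"].write_text(json.dumps([{"id": 10}, {"id": 11}]))
    train(settings)
    demo_predictions = predict(settings)
```

Keeping preparation, training, and prediction separate lets a reviewer rerun prediction from the saved model without repeating the slower training step.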