# GeoSQA

> A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level

To make the dataset easier to use for different tasks, we provide two JSON files: `dataset_release.json`, which includes the binary image data, and `dataset_release_no_image.json`, which does not.

The structure of the JSON file is as follows:

```json
[
    {
        "free-form_annotation": "Annotation of the diagram without template",
        "templated_annotation": "Annotation of the diagram with template",
        "category": "The category of the diagram; multiple categories are separated by the symbol |",
        "optionA": "The content of option A",
        "optionB": "The content of option B",
        "optionC": "The content of option C",
        "optionD": "The content of option D",
        "scenario_diagram": "Binary data of the original diagram, encoded in base64",
        "question": "The content of the question",
        "scenario_id": "The id of the scenario, ranging from 0 to 1980",
        "question_id": "The id of the question, ranging from 0 to 4109"
    },
    {
        ...
    }
]
```

You can load the data with the following Python code:

```python
import pandas as pd

# Each element of the top-level JSON array becomes one row of the DataFrame.
df = pd.read_json("dataset_release.json", orient="records", encoding="utf-8")
```

We also provide `templates.json`, which contains all the categories used for labeling and the corresponding templates for each category. The structure of `templates.json` is as follows:

```json
[
    {
        "id": "The ID of the first-layer category",
        "label": "The label of the first-layer category",
        "children": [
            {
                "id": "The ID of the second-layer category",
                "label": "The label of the second-layer category",
                "children": [
                    ...
                ],
                "templates": [
                    "The templates of the second-layer category",
                    "..."
                ]
            }
        ],
        "templates": [
            "The templates of the first-layer category",
            "..."
        ]
    },
    {
        ...
    }
]
```

Within a template, the symbol `$` marks a slot, and the content in brackets explains that slot.
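
The fields described above can be handled with a few small helpers. The following is a minimal sketch, not part of the released code: the function names `split_categories`, `decode_diagram` and `iter_templates` are our own, and the sample record and category tree are made-up placeholders rather than real entries from the dataset. It assumes that `category` uses `|` as its only separator (stripping surrounding whitespace is a precaution), that `scenario_diagram` holds standard base64 text, and that the category tree in `templates.json` may nest to any depth through `children`.

```python
import base64
from typing import Any, Dict, Iterator, List, Tuple


def split_categories(category: str) -> List[str]:
    """Split a `category` field on '|' and drop empty or padded parts."""
    return [part.strip() for part in category.split("|") if part.strip()]


def decode_diagram(record: Dict[str, Any]) -> bytes:
    """Return the raw image bytes stored as base64 in `scenario_diagram`."""
    return base64.b64decode(record["scenario_diagram"])


def iter_templates(
    nodes: List[Dict[str, Any]], path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], str]]:
    """Walk the category tree depth-first, yielding (label path, template)."""
    for node in nodes:
        here = path + (node["label"],)
        for template in node.get("templates", []):
            yield here, template
        # Recurse into sub-categories; a missing `children` key means a leaf.
        yield from iter_templates(node.get("children", []), here)


# Made-up placeholder data that only mirrors the documented structure.
sample_record = {
    "category": "first | second",
    "scenario_diagram": base64.b64encode(b"fake image bytes").decode("ascii"),
}
sample_tree = [
    {
        "id": "1",
        "label": "first",
        "templates": ["t1"],
        "children": [
            {"id": "1-1", "label": "second", "children": [], "templates": ["t2", "t3"]}
        ],
    }
]

print(split_categories(sample_record["category"]))
print(decode_diagram(sample_record))
for label_path, template in iter_templates(sample_tree):
    print(" > ".join(label_path), "|", template)
```

To use the helpers on the real files, pass a record from the loaded dataset to `decode_diagram`, and pass the list obtained from `json.load` on `templates.json` to `iter_templates`. The decoded bytes can be written to disk or opened with an image library; the image format is not stated here, so the sketch leaves the bytes undecoded.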