{ "dataNeeds": [ { "description": "A spatio-temporal dataset is a dataset which contains latitude/longitude and also time variables.\nI got a spatio-temporal dataset on this site. The dataset contains 1077 rows of data. The problem is, the size of the dataset is too small.\nIf anyone has another link to a spatio-temporal dataset with big size and CSV format and also free, please help.", "url": "https://opendata.stackexchange.com/questions/13749" }, { "description": "I am looking for dataset for journal editorial board members includes:\nname\ncountry\ninstitution\nAny help are welcome.\n\nif not available what is the best website can help me to collect such a data manually using web scraping tools.", "url": "https://www.reddit.com/r/datasets/comments/9r7lva/are_there_any_global_journals_editorial_board/" }, { "description": "I'm looking for any data about music, movie and software piracy. I know there exists datasets about pirated Oscar movies (http://waxy.org/2010/02/pirating_the_2010_oscars/) but I need more information (not only dates).\nBasicly I am going to analyze the impact of piracy on the markets. I don't know yet how to estimate number of people downloading illegal stuff, but for a start any data would be useful.", "url": "https://www.reddit.com/r/datasets/comments/n569v/ask_rdatasets_looking_for_datasets_about_piracy/" }, { "description": "Crunchbase looks most promising but they seem to be payware. Are there any free datasets of companies (established business & startups) to do some data science analysis? Ideally, the data should be name, a description of what it does, location(s), products/services offered etc.\nI am looking for US businesses", "url": "https://opendata.stackexchange.com/questions/12642" }, { "description": "I am looking for datasets containing data from a large number of devices. 
For e.g., the NYC Bus data set contains data from over 5000 buses, the NYC taxi dataset has data from all its taxis.\nAre there other such datasets, like data from 1000s of sensors/vehicles of some kind? The larger the number of sensors the better.", "url": "https://www.reddit.com/r/datasets/comments/a9uc74/datasets_from_large_number_of_devicessensors/" }, { "description": "Does any body know of any dataset of websites tagged by the verticals they are working or their types. For ex. Amazon.com: ecommerce, fb.com: social networking etc\nI want to cluster websites which are similar in terms of their verticals. I was looking for tagged dataset to achieve that. Any other approach is highly appreciated.", "url": "https://stackoverflow.com/questions/39051515" }, { "description": "Does anybody know of a data set that has been open sourced that contains significant (or all) the tables from and e-commerce backend. Examples could include:\nA small store (like a pet store)\nA non-profit's membership management system\nA small restaurant's backend", "url": "https://opendata.stackexchange.com/questions/6667" }, { "description": "Does anyone know of a dataset relating:\nbacteria (not viruses or fungi)\ninfectiousness\nsymptoms\n...or something of the sort?\nI'm looking to classify infectiousness (ie. infectious/not infectious) via bacterial genomic sequencing/metadata. I have the capability to match names to genome fragments from other sets.", "url": "https://opendata.stackexchange.com/questions/10669" }, { "description": "Does anyone know of a dataset that has metadata about news articles?\nBasic dataset requirements:\n100k+ news articles\nCovers 5+ publications\nSpans 15+ years of articles\nMetadata about each article with author names + year published\nMust be sourced legitimately (no TOS-breaking scraped data)\n\nI've looked through many potential sources. 
Here are some examples:\nNew York Times and the Guardian data archives -- fail criterion 2\nNews headlines of India -- fails criterion 4 and possibly 5\nVarious APIs for streaming news (Eventregistry, WebHose.io) -- all seem to be made for recent news, so they fail criterion 3\nReuter's data dumps -- fails criterion 3", "url": "https://stackoverflow.com/questions/50317406" }, { "description": "Good evening. For a research project I'm looking for a dataset that contains information on how cyber-related incidents were handled in companies (e.g. by CERTs). It is very hard to find some sources on the internet (I assume incident records belong the the most confidential datasets in a company).\n\nAnyways I found some basic sets that might be somehow applicable for my work - even if they do not meet 100% what I am looking for. I want to share them with you:\n\nCERT Societe Generale provides a couple (17) playbooks to handle incidents. A very small dataset without \"actual\" examples but at least it covers all kind of attacks: https://cert.societegenerale.com/en/publications.html\n\nFidelis Barncat provided me access to their MISP. They have >400,000 in their database, however I did not find any where an incident response was documented: https://www.fidelissecurity.com/resources/toolsandintel/barncat. Similar data is probably available on circl.lu, getting access is a bit more complicated as they require you to use official S/MIME from your company.\n\nISO-CERT (probably the most interesting one so far) provides some Mitigation information on 1100 incidents reported since 2010. https://ics-cert.us-cert.gov/advisories. Only downside is that they focus on incidents implied by software, i.e. malware trojans etc. but they provide no information on other relvant incidents like SE, Phishing, suspectful behaviour etc.\n\nLast but not least, to get a big database I came up with the idea to mine the data from a forum like bleepingcomputer.com . 
This is like my Plan C which I hopefully never will have to use.\n\nNow - and sorry for the long post - does any of you have an idea where to find a better dataset, also containing non-technical incidents like Social Engineering, Phishing and so on? Any keywords I might have mist (find a list attached) any very good reason why I will never find better results than those that I have on the internet?\n\nCheers\n\nKeywords I used for search (in all combinations) : Mitigatoin, MISP, CERT, CSIRT, Intrusion Response / Recovery, Incident Response / Recovery, Fixing, Technical Procedures, Threat-indicator sharing, Cyber Threat Mitigation, Re-Mitigation, SIEM, Cyber Recovery, Cyber Countermeasure.", "url": "https://stackoverflow.com/questions/54544607" }, { "description": "Hey guys. I'm looking for datasets that show data of cloud instances on platforms like Azure and Amazon Web Services. In particular data that shows the infrastructure of the instance (e.g. assigned memory) against performance metrics (would vary based on type, but one example is network usage).\nVery grateful for any help given.", "url": "https://www.reddit.com/r/datasets/comments/a3oubt/dataset_for_cloud_computing_usage/" }, { "description": "I am looking for open biological datasets and open open health informatics datasets, to test some machine learning algorithms that I designed.\nSpecifically, I am looking for datasets in which each element has a binary label (e.g. true/false, dead/alive), or a real valued label (e.g. 
score=0.7/1), to run some supervised learning approaches.\nCan anyone send me to any open datasets of this kind?", "url": "https://opendata.stackexchange.com/questions/9205" }, { "description": "Hi I have a been set an assignment in university for which i have to use datasets with longitude and latitude data in but im struggling to find some so far i have looked at\n\nhttp://www.geonames.org/export/\n\nthis website offers it but not in a efficient way to read,\n\nhttp://www.police.uk/data\n\nbut the csv forms they offer are proving very hard to import into a mysql database\n\nand the last site is\n\nhttp://linkeddata.org/data-sets\n\nwhich doesnt seem to contain any datasets with longitude and latitude data,\n\ni was wondering if anyone had any ideas of where i could find datasets that meet the criteria that i am looking for and are also free.\n\nThank you in advance", "url": "https://stackoverflow.com/questions/8097095" }, { "description": "Hey Reddit,\nI'm looking for datasets in advertising. I'm not looking for something in particular but anything that has a link with advertising. For example, it could be the number of billboards in a given country, the number of tv spot displayed per year etc...\nThe reason why I need this is because I'm starting a thesis to complete my master about advertising and its influence on consumers from a \"happiness\" point of view (i.e. not from ROI). 
And I was looking for data in advertising.\nI did some googling but to my great surprise I came across nothing else than figures and turnover in advertising which I'm not really interested in, except this as well:\nhttps://ourworldindata.org/grapher/enforcement-of-bans-on-tobacco-advertising\nAny idea where I could find data?\nThanks for reading!", "url": "https://www.reddit.com/r/datasets/comments/9uzapy/dataset_in_advertising_area/" }, { "description": "I am badly looking for a small size (around 50 nodes) link prediction dataset, with node attributes (preferably real-valued attributes). The network can be directed or undirected (preferably directed), but the network preferably not be bipartite. Of course I am looking for a free, ready to use dataset. Perhaps related to a social network, author communications, etc. Does any one knows a specific dataset? I really really appreciate showing me one.", "url": "https://stackoverflow.com/questions/12657958" }, { "description": "I am doing a data mining project on \"health prediction system\". So, Is there any open dataset containing data for disease and symptoms.", "url": "https://opendata.stackexchange.com/questions/6284" }, { "description": "I am doing my research in wildlife vehicle collision avoidance. for that I am in need of the thermal imaging dataset of deer.", "url": "https://opendata.stackexchange.com/questions/14110" }, { "description": "I'm looking for Spanish training dataset AnCora for CoreNLP, specifically this one IARG-AnCora Spanish (AnCora 3.0.1). The website requires a registration. I created an account, tried to register on the website, but account has never been activated. Any help would be appreciated. 
Thanks, Victor", "url": "https://stackoverflow.com/questions/52864491" }, { "description": "I am looking for a clustering dataset with \"ground truth\" labels for some known natural clustering, preferably with high dimensionality.\n\nI found some good candidates here (http://cs.joensuu.fi/sipu/datasets/), but only the Glass and Iris data-sets have labels for the points. I also found some code to generate Gaussian datasets (SynDECA). The main reason I want this is to compare distance metrics for some clustering methods. It's difficult to use external (extrinsic) evaluation criteria as many of those are biased towards euclidean distances; and there are so many to choose from.\n\nThanks!", "url": "https://stackoverflow.com/questions/22619904" }, { "description": "I am looking for a corpus of text to run some trial fulltext style data searches across. Either something I can download, or a system that generates it. Something a bit more random would be better e.g. 1,000,000 wikipedia articles in a format easy to insert into a 2 column database (id, text).\n\nAny ideas or suggestions?", "url": "https://stackoverflow.com/questions/3095813" }, { "description": "I am trying to some analysis on NEST based thermostat and SMOKE CO detectors. How to get the sample data of temparture recordings, humidity etc without purchasing the device, looking for sample datasets with some dummy customer details.\n\nIf not at least a way from virtual devices?", "url": "https://stackoverflow.com/questions/29350745" }, { "description": "I am looking for a dataset of hotel prices, I am couldn't find any public dataset. Anyone to have an idea on where can I find these information ?", "url": "https://opendata.stackexchange.com/questions/9293" }, { "description": "I am looking for a dataset on the location, capacity, company / owner and year of establishment for data centers (eg. 
of Google, Amazon, Facebook).\n\nAny suggestions are welcome.", "url": "https://opendata.stackexchange.com/questions/13111" }, { "description": "I am looking for a dataset that I can obtain about Russia's transportation. I am working on a PDF of the former Soviet Union's St. Petersburg map that I have here and need to georeference it. Is there a website where I can find an English version? The map I have is written in Russian.", "url": "https://opendata.stackexchange.com/questions/11765" }, { "description": "I am looking for a dataset which contains large quantities of relation tuples. For example, a search for \"people\" and \"location\" yields \"lives in\", \"worked in\", etc. University of Washington's OpenIE http://OpenIE.cs.washington.edu is a good tool but their dataset is only accessible through the web. Where can I download a database or library like this?", "url": "https://stackoverflow.com/questions/29591485" }, { "description": "I am looking for an ontology dataset which can describe the concept: \"Project\". For example:\nthe type of the project (new project, improvement, translation...)\nindustries involved in the project (mobile application, design, music)\nproducts of the project\nbut I can't even find a keyword for such a case.\nAre there any suitable datasets for my case?", "url": "https://stackoverflow.com/questions/27780291" }, { "description": "I am looking for a preferably open dataset that contains information regarding the dates that employees took days off their annual leave for the countries of the EU. 
Is there any similar information available for other countries anyone is aware of?\nTo avoid misunderstanding: I'm not looking for public holiday dates themselves.", "url": "https://opendata.stackexchange.com/questions/7184" }, { "description": "I am looking for a sizable training dataset to train a machine learning algorithm.\nI concretely want to train it to detect trash in urban areas, so I would like photos of clean streets vs streets with garbage in them.\nSo far, the only repository of datasets I have found in the web is this one. Any other place I should check? Any suggestion to get a dataset through other means?", "url": "https://stackoverflow.com/questions/35996402" }, { "description": "I am looking for a large sample dataset (preferably in csv format) that has lat/lng coordinates.\nPostgreSQL,PostGIS", "url": "https://stackoverflow.com/questions/5035794" }, { "description": "Hi,\nI am a Masters student and my project is on Semantic Analysis and Topic Modelling of Game Reviews.\nI was trying to scrape data from game review websites such as gamestop.com with Rvest yet in their ToS data scraping is not allowed unless authorized. 
I have sent emails to a few websites like this and received no reply after a month.\n\nI am looking for datasets which include full game reviews with game names, scores and the date of the review.\nThe data can also be from Steam.\n\nThanks!", "url": "https://www.reddit.com/r/datasets/comments/alco6d/game_review_data/" }, { "description": "I am looking for an annotated dataset in German similar to the well-known English IMDB movie review dataset (here).\nThe background is that I would like to categorize German texts into multiple categories (starting with positive sentiment / negative / neutral).\nI have not found German word embeddings pre-trained with sentiment analysis, nor have I found a suitable dataset to train my own word embeddings with.\nAny advice would be appreciated!", "url": "https://stackoverflow.com/questions/55497422" }, { "description": "I'm looking for data sources providing high temporal frequency wind data. Once per minute would be preferred. The TCOONS dataset is a great example. I identified this after speaking with a local National Weather Service office, who directed me to it.\n\nIs there any nationwide listing of such datasets? 
Most of the wind data available appear to be daily or 20 minute averages......", "url": "https://opendata.stackexchange.com/questions/6921" }, { "description": "I am looking for dataset on the outcomes of Formula 1 races, what cars partook in the race, and the specifications of these cars (such as type of tire used; type of engine; width, length and other parameters that describe the shape of the car).\n\nIf the dataset contained information about the drivers as well as the weather conditions and geographic information of the race track that would be all the more better.\n\nI can't seem to find any datasets on Formula 1 racing, so I was wondering if anyone could suggest one?", "url": "https://opendata.stackexchange.com/questions/6557" }, { "description": "I am looking for dataset which contains data about mother's education and father's education for adults (data just for kids I've found to be not sufficient). I am writing my Thesis about effects of the parental education on individual's education and how it varies, so I would love to has the data for various low and middle income countries. Do you know, where can I find it ? Thank you in advance for any help\n\nP.S. DHS program should contain those variables in older releases( manual - see page 111) , but I found only data for Mother's education there (Household member's recode).", "url": "https://opendata.stackexchange.com/questions/7174" }, { "description": "I am looking for dataset(s) that classifies the GSR (skin conductivity) / heart rate readings to emotions and pain levels for machine learning purposes.", "url": "https://opendata.stackexchange.com/questions/11400" }, { "description": "I am looking for datasets containing judicial decisions in France.", "url": "https://opendata.stackexchange.com/questions/9302" }, { "description": "I'm looking for some OLAP data preferably in star schema (or snowflake) for testing a new tool. I've already got the Foodmart database that Mondrian provides. 
Type of data is not important as long as it has dimensions and associated facts. The larger the size the better for load testing. Anybody knows where I can download such a dataset, ideally in SQL or CSV? (other formats are fine too)", "url": "https://stackoverflow.com/questions/6181706" }, { "description": "I am looking for datasets for online donation for people who need help. Something like Kiva", "url": "https://opendata.stackexchange.com/questions/7282" }, { "description": "Hey r/datasets, I'm a PhD student in Math/stats and I'm doing some cluster computing NLP work for the next couple months. Specifically,I'm doing unsupervised text segmentation on (we'll say) news articles and I'd like to find some text that has already been segmented so I have a \"ground-truth\" to judge my model by.\nJust in case someone who reads this and has the dataset I'm looking for doesn't know what I mean, I'd like to have a dataset of documents where the documents have been cut into logical pieces based on their topics. Think about the front page of a newspaper, even if you concatenated all of the articles together a person could read through the long document and pull out where one article ended and another began just by the topics being discussed. I'd like a dataset that has the newspaper along with markers saying where one article ends and another beings.\nOne of the nice things about algorithms is that they don't actually \"know\" the difference between domains so a genome annotated by genes or strings of random words grouped by topics works as well as real text. Although, to be fair,a big part of these models is context dependent so a dataset related to pop culture/entertainment would be awesome!\nI know I'm asking for a lot so in the interest of fairness, I'd be happy to give something in return. 
I've been published doing data analysis in astrophysics, epidemiology and genetics and I'm publishing a paper about SVM theory later this year so if any of those things are relevant to you (or if there's something else I could help with) lets make our own version of the reddit exchange!", "url": "https://www.reddit.com/r/datasets/comments/392lmk/just_another_looking_for_dataset_post_annotated/" }, { "description": "I am looking for datasets related to search engine query logs. The dataset can be eCommerce related searches (Searches users do on amazon, alibaba, google, bing, duckduckgo etc). Searches related to politics on any search engine.\nAlso, can anyone link me to the AOL search query logs. I know that they exists publicly but I can't find them.", "url": "https://stackoverflow.com/questions/52092845" }, { "description": "Every official seaborn demo/example begins with sns.load_dataset(). I'm wondering where I can get those seaborn dataset so that I can follow the examples?\n\nI tried to find them myself using phrases like \"where to find official seaborn dataset\" etc, etc but got no hits.\n\nUPDATE:\nSo, how can I use them? I'm following http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.boxplot.html, and this is the only thing that I get, i.e., I don't get any charts.\nBoth of my seaborn and pandas are working fine though. They are from my Anaconda installation, and are all of the latest versions. 
The version of matplotlib that I'm using works fine with seaborn as well.\n@gabra, I've found those csv files from the internet before asking the question, because I think they are just csv files, and can't be directly used in sns.load_dataset(xxx), right?", "url": "https://stackoverflow.com/questions/34111369" }, { "description": "I am looking for datasets that lists the location of accidents or traffic (latitude and longitude) with date and time in many countries.\nI found datasets for USA and UK, now looking for datasets for other countries.\nAny type of road accident would be great.", "url": "https://opendata.stackexchange.com/questions/11146" }, { "description": "I am looking for EEG brainwave dataset to classify depressed people from the healthy ones. I can find many for emotion recognition like happy, sad or neutral but I need specifically for depression detection. Like the eeg brainwave dataset of people already suffering from depression and the same for the healthy people.\nAny help from your side would be really appreciable.\nThanks in advance.", "url": "https://stackoverflow.com/questions/55134358" }, { "description": "I am looking for new datasets of documents, from which to extract the matrix terms-documents, to perform co-clustering algorithms.\nI am looking forsingle-label datasets only and prefer free access ones.\nI already know the following datasets.: \nCSTR\nWebKB4\nNewsgroups\nReuters\nK1A, K1B, wap (WebACE Project)\n\nDo you know of any others? \nYou also know of the new co-clustering algorithms created in the last two years? \nthanks", "url": "https://stackoverflow.com/questions/14978668" }, { "description": "I am looking for openFDA datasets for my current academic project. The project describes that a user enters medicine name and the result has to show me the information about the adverse events based on the medicine name. To do this kind of thing I must have a dataset. I did a lot of research but i am not able to get it. 
Please help me guys.", "url": "https://opendata.stackexchange.com/questions/3677" }, { "description": "I am in a class doing machine learning and we are working on a dataset for malware on microsoft computers. We are looking for a similar dataset for apple. We got the microsoft dataset from kaggle and are hoping (but not expecting) to get one for apple to make our project more balanced", "url": "https://www.reddit.com/r/datasets/comments/be2p6t/looking_for_apple_malware_dataset/" }, { "description": "I am looking for some large public datasets, in particular:\n\nLarge sample web server logs that have been anonymized.\nDatasets used for database performance benchmarking.\n\nAny other links to large public datasets would be appreciated. I already know about Amazon's public datasets at: http://aws.amazon.com/publicdatasets/", "url": "https://stackoverflow.com/questions/381806" }, { "description": "I am looking for specific customer support chat dataset or question answers dataset, for example network support, hardware, software support, ticket remedy chats etc. I need to mine the data for specific contents. Looking for data in text format:message: hi, response: hello", "url": "https://opendata.stackexchange.com/questions/12093" }, { "description": "I am looking for the dataset of NTIRE Workshop and Challenges @CVPR2018 as the title. Please tell me if you know. Below is a reference link.\nCodaLab - Competition https://competitions.codalab.org/competitions/18015", "url": "https://stackoverflow.com/questions/50483902" }, { "description": "I'm looking for datasets of street scenes recordings (images and/or videos), for computer vision prototypes applied to autonomous cars. I've found a lot of them (KITTI, Mapillary Vistas, CityScapes, SYNTHIA...), but all the licenses only allow education and research purposes. 
Do you know any allowing commercial use ?", "url": "https://www.reddit.com/r/datasets/comments/86j4ez/commercial_datasets_for_autonomous_driving/" }, { "description": "I am looking for twitter or other social networking sites dataset for my project. I currently have the CAW 2.0 twitter dataset but it only contains tweets of users. I want a data that shows the number of friends, follower and such.\nIt does not have to be twitter but I would prefer twitter or facebook. I already tried infochimps but apparently the file is not downloadable anymore for twitter.\nCan someone give me good websites for finding this kind of dataset. I am going to feed the dataset to hadoop.", "url": "https://stackoverflow.com/questions/3340810" }, { "description": "Hey All,\nI'm looking for a dataset with a bucket load of raw and post-processed digital photographs of people and/or landscapes. Quite niche I know.\nI want to do some AI testing on recognition and manipulation and estimate at the very least I'll need a data set of 10000 raw images and 10000 post processed images to see any kind of interesting result. The Raw images will be the image straight out of camera and the edited image will be a retouched/manipuated image.\nOn the off chance, anyone have an idea on where to aquire this sort of stuff? I've been thinking photographer's back catalogs and businesses that deal with e-commerce imagery but wondered if there was a pre-existing dataset anyone knows of?\nCheers!", "url": "https://www.reddit.com/r/datasets/comments/aps3g4/looking_for_a_dataset_of_pre_and_post_processed/" }, { "description": "I am new to machine learning and I am wondering where I can find good datasets with people information. 
For example I am interesting in dataset with people gender, number of children and their genders, occupation and maybe some more data about people.\nI searched a lot over the internet but still can't find anything which matches my needs.\nThe UCI Machine Learning Repository does not have the dataset I am looking for.\nAppreciate your help.", "url": "https://stackoverflow.com/questions/31771416" }, { "description": "I am relatively new to programming and therefore don't really know where to find specified datasets. What I am basically looking for is Sales Data from any Company (can be made up).\nThe data needs have coverage of the number of clients, the amount of money they paid and the date for each transaction. I want to predict the worth of each customer with this information, using the RFM-method on R.", "url": "https://opendata.stackexchange.com/questions/10002" }, { "description": "I'm searching for a dataset of at least 1000 examples of opinion articles from major news sources so that I don't have to scrape them myself. If it is including the content of the article that would be great, but I need at least the valid urls for each article -- alternatively, a dataset of general news articles labeled by category (so I can pick out the opeds).\nI've looked at NewsAPI but cannot filter for opeds, and any news data I've found does not label opinion articles, so anything you may know of will be a big help!", "url": "https://www.reddit.com/r/datasets/comments/b7gxfj/looking_for_a_dataset_of_published_opeds_or_news/" }, { "description": "I am working on a sentimental analysis of a french dataset in both R and Python. I know there is a dataset for english like AFINN where each word is rated for sentiment. 
I am looking for something similar for French where I could see a dataset which gives you numerical score for each of the words\nBelow is a simple query in R to get the sore of AFINN\n\nget_sentiments(\"afinn\")\n\nPlease let me know if there is any datasource available for French words.\nThanks\n\nget_sentiments(\"afinn\")", "url": "https://stackoverflow.com/questions/55598979" }, { "description": "I am working on an expert system project and I am looking for sample data-sets for computer system products (laptops, desktops, tablets, etc.). This data should include at least price and specifications.\nDo you know where I can find data which has this information?", "url": "https://opendata.stackexchange.com/questions/10852" }, { "description": "I am working on correction of errors in the output of ASR systems using data mining and NLP techniques, for that i need an n-gram dictionary. I started with wikipedia ngram it give encouraging result (75% detection rate) in small test set. But when i test my solution on a large dataset, the detection rate decreased because the wikipedia ngram in not large enough to cover all english words. So i'm looking for larger ngram collected from the web, i found the \"Google Web 1T 5-Grams\" but my laboratory don't have enough resource to purchase it. If anyone already have this dataset or know how to get it for free, please help.", "url": "https://stackoverflow.com/questions/36158501" }, { "description": "I am working on oil spill detection with imageJ. And I am looking for freely available dataset for Testing my implemented work. If anyone can help me...", "url": "https://stackoverflow.com/questions/29295136" }, { "description": "I am working on simulation sensor data for an industrial machine I can choose. 
For that I am looking for multiple data points from a single process, for example, multiple pressure and temperature curves to use as sort of blueprint.\nMy simulation does not focus on accurate simulation but instead on believable sensor data output.", "url": "https://opendata.stackexchange.com/questions/9971" }, { "description": "I cannot find the coordinates of the UK NUTS levels. There are resources which tell what the name and the code of NUTS regions are. But they do not provide what are their coordinates! I need a dataset which should include the NUTS name and the lat/long that surround the area. Data can be in any standard formats, e.g. CSV or Geojson.", "url": "https://opendata.stackexchange.com/questions/12951" }, { "description": "I have a problem to find the HUST-ASL Dataset (it's a hand gesture dataset acquired with Kinect). The creators mentioned it should be at http://mc.eistar.net/UpLoadFiles/File/hust_asl_dataset.zip, but that link doesn't work.\nHUST = Huazhong University of Science & Technology\nThe reference article is : \"Depth-Projection-Map-Based Bag of Contour Fragments for Robust Hand Gesture Recognition\".", "url": "https://opendata.stackexchange.com/questions/11868" }, { "description": "I have been looking for any general datasets about LGBT Americans. The type of file I look for are either CSV or TSV. Although my inquiry will hopefully yield some spatial indices at the county or state level, because of the apparent lack of initial success in searching I am hoping for data of some depth of any sort. 
Any suggestions or help is greatly appreciated.", "url": "https://opendata.stackexchange.com/questions/6961" }, { "description": "I have read the paper Understanding the Effectiveness of Video Ads: A Measurement Study (pdf), but couldn't find its dataset.\nDoes anyone here know about this dataset or another one that includes user behavior?\nOur data set is one of the most extensive cross-sections of enterprise videos used in a scientific study of this kind. The data used in our analysis was collected from 33 video providers over a period of 15 days consisting of 362 million videos and 257 million ad impressions that were watched by 65 million unique viewers located across the world. They used an Akamai client-side media analytics plugin.", "url": "https://opendata.stackexchange.com/questions/13512" }, { "description": "I am building a Named Entity Recognizer with a Conditional Random Field and am looking for two things:\n\nA) An open source, English NER dataset for Person, Location, and Organization entities\nB) A list of English NER features\n\nI have already looked at the CoNLL-2003 corpus and found this is exactly what I want but it is not readily available. I have been unsuccessful in finding a list of NER features; I am trying to avoid having to hand design these features.\nThanks", "url": "https://stackoverflow.com/questions/15045190" }, { "description": "I need to work on a research project in big data analysis. I am very interested in neuromarketing, user experience and neuroinformatics. I've searched for an interesting public dataset for my work but still haven't found one. I found one dataset: LSW dataset, but this doesn't seem to be what I'm looking for. Please help me to find some suggestions or links to download free and open source public data sets on neuromarketing and advertisements. by using neuromarketing data i plan to find out some psychological behavior patterns between Online marketing and Web UI designings. 
So I'm looking for \"online ads and the number of clicks each had\" or \"website index pages with number of visitors\". It would be better to have region-wise localized data.\nThanks in advance.", "url": "https://opendata.stackexchange.com/questions/5238" }, { "description": "I want to benchmark some (graph) databases and am looking for some big, complex datasets. The dataset should have a size between 2 TB and 5 TB. Do you know any sample datasets (maybe open government or science data) which fulfill these criteria?", "url": "https://stackoverflow.com/questions/24935871" }, { "description": "I want to build a classifier which can classify gender and ethnicity based on names. I am looking for datasets to download from which I can take samples for supervised learning. If there are no open datasets available for download, what are the ways that I can create one?", "url": "https://opendata.stackexchange.com/questions/12573" }, { "description": "I want to do a Social Network Analysis using Hadoop in the Telecom industry. I'm looking for a dataset to work with. Does anyone know a good dataset to analyze some relationships between users?\nMany thanks!", "url": "https://stackoverflow.com/questions/38746672" }, { "description": "I would like to do emotion classification on text (posts from social media e.g. tweets, facebook wall posts, youtube comments etc ...). However, I can't find a good dataset with annotated data. I'm looking for more than just data annotated with positive and negative. I'm looking for a dataset with several emotions. This could be either discrete values (Ekman's 6 basic emotions) or continuous values (arousal-valence model). Does anyone know where I can get such a dataset, this can be from twitter, Facebook, Myspace ...
as long as it is from a social network", "url": "https://stackoverflow.com/questions/13127066" }, { "description": "I wonder if someone can help me get a dataset to test a Text Segmentation approach that I developed.\nI looked for Freddy Choi's dataset and I couldn't find it. I need this dataset specifically.\nIf someone has it or knows where I can get it, please advise.\nAlso if anyone has suggestions for other datasets for the same task, please advise.\nThanks", "url": "https://stackoverflow.com/questions/29306241" }, { "description": "Okay, so I've looked at the index data for the S&P500 Companies, but also at the sector they work in.\nI want data that captures the graph structure of these companies, but I am not sure about anything concrete.\nI thought about selecting all the companies as nodes and adding an edge if they are within the same sector, but the graph would be really \"sparse\" with multiple connected components.\n\nDo you have any idea what I should look for?", "url": "https://www.reddit.com/r/datasets/comments/aayvhl/looking_for_a_dataset_that_captures_a_graph/" }, { "description": "I have created several algorithms to solve the shortest path problem. I am looking for real data with longitude and latitude coordinates as well as distance for nodes of various cities.\nFor example here is a dataset I found, however it is probably not very compatible with Open Street Maps or is perhaps a bit outdated. For example some nodes seem to be random houses in the middle of nowhere. The nodes are always named from 0 to N. It contains information about Driving times between nodes, Walking times, and then the Latitude and Longitude of every single node. And that is done for several cities. One file even contains the whole USA graph.\nI am looking for something like this that is better supported by Open Street Maps. Is there something similar in perhaps raw text format?
It would be ideal for my use case.", "url": "https://stackoverflow.com/questions/36010538" }, { "description": "I'm looking for a dataset of pharmacy Locations in Chicago area. The city of Chicago said they do not have this dataset, I'm wondering where I can find it.", "url": "https://opendata.stackexchange.com/questions/7099" }, { "description": "I'm looking for a dataset to do semantic scene labeling. I know that for indoor scenes there is NYU Depth v2. I also found some datasets recorded from a driving car in an outdoor setting. However I'm looking rather for a outdoor dataset that is recorded by a flying vehicle (drone). Does anyone have an idea?\n(I apologize if this is not the right place to ask this question, but I couldn't find a better suited one in the jungle of subforums :-/ )", "url": "https://stackoverflow.com/questions/37155132" }, { "description": "Hi all, I am working on a research project and I need help finding a Dataset for weekly temperature measurements for Anchorage Alaska over 5 years beginning in April 2009", "url": "https://www.reddit.com/r/datasets/comments/9v4bo3/looking_for_a_dataset_that_has_weekly_temperature/" }, { "description": "I'm looking for a good dataset of business locations, hopefully for all of the USA. I'd love to have \"name,\" \"business type,\" and \"lat/long,\" although I'd settle for \"street address\" rather than \"lat/long,\" and I could geocode the points myself.\nAre there any free or relatively cheap data sources for business locations? Can I get this information from google?", "url": "https://stackoverflow.com/questions/8028328" }, { "description": "I'm looking for a large dataset of tweets that have geolocation data (from the U.S.). Is there such a dataset available anywhere? I looked on infochimps, but didn't see anything.\nIf not, what's the best way to generate this dataset myself? 
Should I just run the Twitter Streaming API on my local machine (or maybe on AWS?), and then filter and save all geo-tagged tweets?", "url": "https://stackoverflow.com/questions/4623854" }, { "description": "I'm looking for a phenomenon, preferably within social sciences/economy, with available datasets that follow a sigmoidal relationship. For example, a relationship could be between the amount of money earned and the money spent. I would be grateful for any pointers.\nSteep sigmoidal input-output relationships are very common in biological systems (see the review by Steven Frank). The input in such a setting typically corresponds to the concentration of some chemical ligand, while the output is the concentration of another chemical downstream in a biochemical pathway that responds to the input. The relationship is also termed the dose response or, in engineering terms, the response characteristic. Biological systems (networks) that give steep sigmoidal responses to inputs are called \"ultra-sensitive\" and typically modelled with Hill equation where the Hill coefficient is greater than 1.\nIn principle, the sigmoidal relationship that I'm looking for, should exist between two variables, e.g. the duration of advertisement block on TV (x-axis) vs. the time to switch the channel (y-axis), etc. In the best case scenario such a relationship would be measured for a number of individuals. Therefore, I could have the entire \"response characteristic\" for an ensemble of, say, 100 people.\nWhat I'm not looking for, however, is the temporal behaviour of phenomena, e.g. the accumulation of wealth in time that could also follow a saturation-type sigmoidal curve.", "url": "https://opendata.stackexchange.com/questions/3996" }, { "description": "Hi\nI'm looking for a good dataset of global food prices, preferably by country. 
Also, historical data would be excellent too.\nThe purpose is to examine the cost of living around the world, in relation to the unemployment rate.\nDoes anyone have any good hints to help me get started? I found some things through searching but they weren't that spectacular.", "url": "https://www.reddit.com/r/datasets/comments/8vlktc/looking_for_global_food_price_dataset/" }, { "description": "I'm looking for a set of data that contains both numbers and strings (name/address, maybe), with a decent variety of data, around 1000 records, to test a JQuery-UI widget I'm developing. Does anyone know of such a dataset? Is there something floating around out there I could use?", "url": "https://stackoverflow.com/questions/6846648" }, { "description": "I wanted a dataset which categorizes many windows software into categories like news, music, video player, gaming, software development etc. Something like:\n\nVLC: video player,\nspotify: music,\nVisual studio code: software development etc.\n\nI searched on internet but couldn't find any such dataset, may be you guys could help me out. Thanks!", "url": "https://www.reddit.com/r/datasets/comments/axxlhw/looking_for_a_dataset_which_contains_many_windows/" }, { "description": "I'm looking for dataset for England clubs' players. No need for the statistics of clubs' win/lose. I'm looking for players' specific statistics. Like number of shoots or goals or position. Preferably if it includes historical data\nAny other clubs/league can be great too. (if the data are rich enough) Please some share resources", "url": "https://opendata.stackexchange.com/questions/5321" }, { "description": "I'm looking for Datasets for Autonomous Driving that contains labels for classification. Specifically, I'm looking for Dataset that has labels indicating if there is pedestrians in the image or not. I want to create classifier which can identify if the image contains pedestrians (some people). 
Is anyone familiar with such a dataset?", "url": "https://opendata.stackexchange.com/questions/13879" }, { "description": "I'm working on a project that requires analyzing e-mail text, but for obvious reasons, I don't have direct access to such data.\nSo I'm looking for a dataset that will have similar properties, such as length, language, style, etc., but so far I have had no luck. Any ideas?", "url": "https://www.reddit.com/r/datasets/comments/bdnrg7/looking_for_a_dataset_with_similar/" }, { "description": "I'm looking for datasets with people's dialogues, just everyday communication.\n\nExample:\nHow are you?\nFine, thank you, and how are you?", "url": "https://opendata.stackexchange.com/questions/13815" }, { "description": "The Drake equation is basically an equation that predicts how many intelligent civilizations are possible in a given galaxy based on several parameters. I'm looking for a large dataset that has (not necessarily limited to) these variables about a bunch of galaxies, or at least variables that can let me calculate them.
Here are the variables I'm looking for:\n\nR∗ = the average rate of star formation in our galaxy\nfp = the fraction of those stars that have planets\nne = the average number of planets that can potentially support life per star that has planets\nfl = the fraction of planets that could support life that actually develop life at some point\nfi = the fraction of planets with life that actually go on to develop intelligent life (civilizations)\nfc = the fraction of civilizations that develop a technology that releases detectable signs of their existence into space\nL = the length of time for which such civilizations release detectable signals into space[5][6]\n\nEDIT: Doesn't have to be one dataset, could be a few, just need a large collection of a few of these variables at least!", "url": "https://www.reddit.com/r/datasets/comments/bf6xq1/looking_for_a_dataset_with_the_variables_of_the/" }, { "description": "I'm looking for a database of commonly installed Windows software. At minimum I need the name of the software and the executable name, but it'd also be nice to have the publisher and the common installation path, etc. Basically, I'd like to be able to query it to find all the software by Adobe and the associated executable name, etc.\nBasically I'm looking to be able to do\n\nSELECT * FROM Software WHERE Publisher = 'Microsoft' \nSELECT * FROM Software WHERE Executable = 'devenv.com'\n\nI came across an effort to create such a database a long time ago, but can't seem to find it now. Maybe it fizzled out. Any help would be greatly appreciated. Thanks.", "url": "https://stackoverflow.com/questions/223070" }, { "description": "I'm looking for government datasets on a wide variety of topics concerning the population of the country. Things like the crime rate, unemployment rate, level of educational attainment by different segments of the population, median income, etc. 
I would also like this for a large range of years and also on a national, state, and county level.\nI know, I know this is a crazy dataset which is very ideal, and I will probably have to cobble it together from multiple other datasets. I've already gone through the FBI API, BLS API, and Census API. They range in quality, but they are interesting and have lots of data relevant to the task at end. At some point I suspect I may also have to move on from the APIs on these sites and start doing some manually downloading of files or create some sort of scraper.\nReally though, what I'd like, and what I have not encountered yet is a data format where the demographics are included in column format. What do I mean by this?\nWell... when I query the BLS API, I provide a specific variable ID, say I'm looking for the unemployment rate for Male 16-19 year olds. I get my result. It will have one column for the date, one column for the variable id, and one column for the value. IDEALLY, it would be AMAZING, if it also had a \"sex\" column and an \"age\" column. It would be so amazing to have a dataset like this! Instead of having to know what the specific variable ID was, I could just filter on whatever demographics it was I wanted.\nCurrently, I am building this \"ideal\" dataset by hand so to speak. It's very slow going. So I'm curious if anyone here knew of datasets formatted like I am talking about. I hope this was clear.", "url": "https://opendata.stackexchange.com/questions/12157" }, { "description": "I'm looking for home sale data that includes info such as address, sale price, home type, sq. ft., etc. Where (what dataset(s)) can I get that data from? Thanks.", "url": "https://opendata.stackexchange.com/questions/4736" }, { "description": "I am developing a method for soccer player detection and labeling in a field from a fixed-view camera. I wonder if there exist a dataset that contains several images for each player across different frames. 
The ideal dataset would have a few hundreds (if not thousands) images of only maximum a few dozen players from a top-view, with each image file name containing the name of the player. The image would be of course quite low-resolution, being originally cropped from a larger view of the football field.\nAny idea if such or similar datasets exists? (I would like to avoid the tedious task of creating one myself)", "url": "https://opendata.stackexchange.com/questions/4545" }, { "description": "I'm looking for RGB-D (D is depth) datasets of images and videos for object detection and recognition tasks similar to the ones provided by PASCAL VOC or ILSVRC (but augmented with depth data). Please help.", "url": "https://stackoverflow.com/questions/44334724" }, { "description": "I'm looking for some data to create lookup tables with. Specifically, all the counties in each state in the US, and all the cities in each county.\nWhere might I find municipal datasets like this?\nEDIT: I'm looking at census.gov and this appears like it may be the ticket.", "url": "https://stackoverflow.com/questions/764693" }, { "description": "I'm looking for the NTU-HD dataset (a hand gesture dataset; each gesture consists of a color image and the corresponding depth map).\nThe reference article is Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera.\nNTU = Nanyang Technical University, Singapore", "url": "https://opendata.stackexchange.com/questions/11889" }, { "description": "I'm looking to do some text analysis on complaints or requests filed at cities or municipalities.\nThe data must have:\n\ntype of complaint (e.g. \"Noise complaint\")\ncomplaint text (e.g.: \"The people in the building across from me are having a party\")\n\nIt would also be nice if there's more meta-data, such as:\n\ntime of complaint\nlocation\nresponsible department (e.g. \"Police\" or \"Fire dept.\")\nresponse (e.g. 
\"warning issued\")\ntime of resolution\n\nSo far I've found various 311 datasets (such as the New York one), but none of them contain the actual complaint text.\nPreferably the data would be in English, Dutch or German, but I'll take what I can get.", "url": "https://opendata.stackexchange.com/questions/11463" }, { "description": "I'm planning on working on a project at an upcoming event. I want to translate sign language (ASL) on the fly/teach a computer to translate ASL. I have a cousin who has taken a few ASL classes so I'll have him help verify, but are there any resources for images/gifs/videos of letters and common gestures?\nI would be attaching a camera to record the user signing and then translate it.", "url": "https://opendata.stackexchange.com/questions/10971" }, { "description": "I'm planning to make a movie genre classifier based on movie plots. I'm looking for a database which includes movies plots and genres. I tried to get IMDB's own dataset but it's paid also doesn't have movie plots. Any idea would be appreciated, thanks for help.", "url": "https://stackoverflow.com/questions/46875469" }, { "description": "I'm working on a computer vision project involving satellite/aerial imagery, and I'm having trouble finding the kind of labelled data I need. So far, I've come across datasets like:\n\nhttps://archive.ics.uci.edu/ml/datasets/Statlog+(Landsat+Satellite)\nhttps://aws.amazon.com/public-data-sets/spacenet/\n... (can't post more links)\n\nwhich contain labels like building footprints or soil type. I'm looking for data that's annotated with features like rivers, forests, deserts, lakes, stuff like that. I'm not sure this is out there, but thought I would ask. Thanks!", "url": "https://opendata.stackexchange.com/questions/9964" }, { "description": "I have downloaded an OpenStreetMap data. This data contain useful information like spatial coordinates and the textual description of many objects. 
For example, the hotel Park Hyatt at coordinates (25.2435938, 55.3328381) is a spatial object stored in OpenStreetMap dataset.\nI have to associate the spatial object (i.e. Park Hyatt) to a user review. It can be a user evaluation (i.e. \"good\" or \"bad\") or a user rating (like 5/10).\nSo, I am looking for a public dataset with user reviews/ratings of places stored in OpenStreetMap.\nUntil now, I have found the Google Places API. I can solve my problem using it, but it has access limits. I will need some time to obtain all ratings I need. So, I am wondering if already exists any dataset with user ratings to OpenStreetMap Objects.", "url": "https://stackoverflow.com/questions/45200062" }, { "description": "I'm looking for datasets that contain violent crime stats involving guns. I've googled but have only found organized and graphed the data and I woulda prefer raw data that I can aggregate.\nThanks", "url": "https://opendata.stackexchange.com/questions/6594" }, { "description": "I'm working on Social Network mining project, and I'm looking for a \"real social network dataset\" (comments, ,comments on comment, likes, friendship, interest, feeling, places,liked pages, published photos, videos, posts, hashtags anything more is positive )\nI searched a lot, but all available networks are just about nodes and edges (like A follow B). For example\nhttp://snap.stanford.edu/ I search twitter, but its not open because of privacy terms http://an.kaist.ac.kr/traces/WWW2010.html\nAnyone have a suggestion for a dataset?", "url": "https://stackoverflow.com/questions/27324429" }, { "description": "Im looking for 2d floor plan dataset for annotation, but cannot find the big dataset. I already found site 2d floor plan dataset, but here does not have a lot dataset. 
It would be good to find a dataset of around 500 2D floor plan images (.jpg)", "url": "https://stackoverflow.com/questions/51375993" }, { "description": "In the article Vehicle Detection in Aerial Images the authors used a dataset of vehicles from an aerial view called itcvd. I could not find any information on it online.\nHow can I obtain more information on it or download it?", "url": "https://opendata.stackexchange.com/questions/12868" }, { "description": "I'm looking for a Lorem Ipsum Credit Card Dataset. I'd like it to resemble legitimate data, but want it to be junk data. No real numbers, no real names or addresses. Fake info for demonstration purposes to a classroom of students.\nPreferably at least 300 entries, or so, if possible.", "url": "https://www.reddit.com/r/datasets/comments/am4cpl/looking_for_a_lorem_ipsum_credit_card_dataset/" }, { "description": "Hi,\nI am working on a project for which I would need a richly featured product dataset. Would it be possible to download the catalog of Amazon or Walmart for example? Including the pictures, product description, category and dimensions meta-data etc.\n\nThe closest I've found is the Brazilian E-Commerce Public Dataset by Olist on Kaggle.\nWhile there is weight and dimension information, the dataset seems to be more concerned with the product mix at an order level.\nSpecifically, the product description and photo, which are what I am interested in, are missing from the product dataset.", "url": "https://www.reddit.com/r/datasets/comments/9wc0sz/looking_for_a_rich_ecommerce_product_dataset/" }, { "description": "I am looking for a weather dataset with solar radiance and solar energy production data for the same location. I did some digging around and found Open power system data but the weather dataset has only temperature, wind and radiation data. I am looking for more parameters of weather like dew point, etc.
If there are some resources please share I would be very grateful.", "url": "https://www.reddit.com/r/datasets/comments/ap4jxn/looking_for_a_weather_dataset_with_solar_radiance/" }, { "description": "Hello everyone !\nI am looking for datasets with related info with bananas or sunflowers crops along the world.\nSpecifically, I am searching:\n\nProduction data (most important)\nFenology\nIrrigation\nType of ground\nLocalization\n\nI know that this is too much to ask but I have faith in reddit and its users\n\nThanks!", "url": "https://www.reddit.com/r/datasets/comments/9g3qgs/looking_for_crops_datasets_bananas_and_sunflowers/" }, { "description": "I'm looking for datasets about social media usage by country and age. I'm gonna need datasets for facebook, twitter and Instagram for years 2012-2017. So far google has not been helpful. Thanks", "url": "https://www.reddit.com/r/datasets/comments/74p6j4/looking_for_datasets_about_social_media_by/" }, { "description": "I am testing computer vision algorithms for image categorization. I would like to find a dataset with a few categories of objects e.g. cats and dogs. This dataset should have all the variability within each class be due to the class's intrinsic variability. That is, I don't want to have to worry about pictures taken from different viewpoints or under different lighting conditions etc. Almost all the variability within a category should be due to the intrinsic variability of that category e.g. the category of cats would contain many different images because cats actually look different from one another, not because the images were produced under different conditions.\nPreferably, the objects will be \"cut out\" (on a uniform background). The size of the dataset is not important. Synthetic images (perhaps produced with 3D graphics software) are also ok. 
The images need to come labeled with their category.\nDoes anyone know of a dataset like this?", "url": "https://stackoverflow.com/questions/5474445" }, { "description": "Looking for an open-source Thesaurus dataset which contains as many English root words and synonyms as possible. Any solutions and associated links to data would be appreciated.", "url": "https://stackoverflow.com/questions/5618304" }, { "description": "Most website only provide historical actual weather data (e.g. the observed temperature on March 5 2015 was ...). I am specifically looking for forecasted weather data (e.g. the forecasted temperature on March 5 2015 for the next day was...).\nAnybody who came across such datasets? The geographical area I am looking for is The Netherlands.\nMany thanks.", "url": "https://opendata.stackexchange.com/questions/12487" }, { "description": "Looking for massachusetts childcares. data format can be in any spatial format(shapefile,kml,geojson) or a csv/xlsx etc... as long as it has addresses or coordinates", "url": "https://opendata.stackexchange.com/questions/10562" }, { "description": "Looking for public datasets with positive and negative sentiment labels.", "url": "https://stackoverflow.com/questions/55707651" }, { "description": "Hi all,\nI'm looking for a meeting dataset containing annotations like \"action items\", \"decisions\", \"reminder\" and labels similar to them.\nI know AMI and ICSI has meeting transcripts but they don't have such annotation which I've mentioned above.\nMicrosoft seems to have build the dataset on their own as mentioned in their paper, (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/ActionableItem_camera-ready-1.pdf) they've also provided the link to dataset in that paper, but I couldn't find the dataset there.\nCould anyone of you help me in finding the dataset which has the above mentioned labels ?", "url": 
"https://www.reddit.com/r/datasets/comments/bevwx8/looking_for_datasets_containing_action_items_in/" }, { "description": "My task isn't simple but I want it to be simple and efficient. I am looking for a dataset which has brand-specific messages. For example: Amazon is a brand that sends ecommerce messages to users. I want these messages so that I can generate an NLP model based on them and use it inside my SMS app to recognize these types of messages so that I can build specific UI cards based on them. So I want these messages for all the brands available. There are a lot of brands out there and I want all of the transactional SMS messages which they use to communicate with their customers so that I can build my NLP model. Right now I am gathering data from my friends, asking for their inboxes so I can get those messages, then cleaning those messages and collecting only the transactional messages available for brands. Does anyone know how to collect an SMS dataset for all the brands available, specifically their transactional messages? Is there any simple and efficient way?", "url": "https://stackoverflow.com/questions/55454088" }, { "description": "I need a data set which contains reviews (comments) about people (e.g. doctors / lecturers / politicians) for my project. Where can I get that type of data set?", "url": "https://www.reddit.com/r/datasets/comments/85qnt8/looking_for_a_dataset/" }, { "description": "For music data in audio format, there's The Million Song Dataset (http://labrosa.ee.columbia.edu/millionsong/), for example. Is there a similar one for music in symbolic form (that is, where the notes - not the sound - are stored)?
Any format (like MIDI or MusicXML) would be fine.", "url": "https://stackoverflow.com/questions/5384695" }, { "description": "Specifically, I'm looking for data on the number of teachers belonging to a union for each county in Florida.", "url": "https://opendata.stackexchange.com/questions/10982" }, { "description": "I am looking for data containing authentication pattern along with the time. This is basically authentication attempts with timestamp. An authentication attempt could be either logging in to account with password, or register new account with password.\nTo start with, the dataset should at least contain time, which is the time when an attempt has been made.\nIdeally, it could also contain e.g. the application of each attempt made to, or the password used in each attempt.\nDue to privacy issue, we could also not record password exactly, but differentiate different passwords using id. For example, the ideal data could look like this\n\ntime id\n9:27 2/12/2015 1\n12:27 2/13/2015 1\n14:00 2/13/2015 2\n\nIn this example, this person make one authentication attempt with one password around 9am on Feb. 12, and another attempt with the same password at 12pm, Feb. 13. Then, he/she make one more attempt with another password around 2pm on Feb. 13.", "url": "https://opendata.stackexchange.com/questions/4543" }, { "description": "We are having a school project on a mathematical model of tumor growth. We would like to compare our model to the real data. Despite a lot of search we couldn't find any open data on the web. Hence, can anyone recommend a resource for the datasets of tumor cells growth?\nEDIT: It seems that we rather need a dataset from test labs, like tumor growth data in mice, rats etc. The reason for that is that we need a data of unrestricted tumor growth. The data from humans is most of the times affected by different treatments to prevent the growth of the cells. 
Please correct me if I'm wrong, because I'm not a biologist, I'm a mathematician.", "url": "https://opendata.stackexchange.com/questions/4002" }, { "description": "We have created a Handwritten Character Recognition system and now want to test the system on English characters (both digits and alphabets). For digits, we have performed our testing on MNIST data set. However for the English alphabets we have not been able to find any openly available (i.e. available for free), standard data set. All we have been able to find is NIST and CEDAR handwritten data sets, available on their respective official sites, but they come at a substantial cost.\nIs there any other, openly available standard data set of English alphabets which can be used for testing the Handwritten Character Recognition system.", "url": "https://stackoverflow.com/questions/17296140" }, { "description": "I'm looking for datasets allowing one to map names (given names, family names) to geographical location. So far, the best source I've found are Wiki/DBpedia with ~ 1 million records, and Web Of Science Author names and geolocation of their affiliation.\nIs there any other source I could use someone can think of ?\nThanks !", "url": "https://opendata.stackexchange.com/questions/10278" }, { "description": "I am looking for a huge list(csv?) which includes all countries, cities,villages, airports, cafes, etc, for a project that requires structured places for easy search.\nAnd alongside each place, it would be nice to have the country, city, gps coordinates, and anything else.\nDoes anyone know of such a dataset I could buy/download?", "url": "https://www.reddit.com/r/datasets/comments/7ojju9/looking_for_datasets_of_locationsplaces/" }, { "description": "I am looking for datasets of professional American English sports commentary such as that of John Madden. It can be any sport as long as it is meets the criteria I mentioned. 
It does not necessarily have to be text data as I can have it transcribed but text is preferable. I have found a number of YouTube channels that post full NFL, NBA, MLB, etc. games, but if anyone knows of any more centralized data sources for that kind of thing, that would be great. Thanks!", "url": "https://www.reddit.com/r/datasets/comments/auoku7/looking_for_datasets_of_sports_commentary/" }, { "description": "I'm working on a project for grad-school. I wanted one of my variables to be black perceptions of racism/racial tension as a form of stress. I know there is information along this line because of recent studies that use such a variable to analyze maternal mortality rates among black women. So far I've been unsuccessful finding a relevant dataset. Any suggestions about where I would find such information would be incredibly helpful. Thanks!", "url": "https://www.reddit.com/r/datasets/comments/b7ygtz/looking_for_datasets_on_perceived/" }, { "description": "Hey guys I'm working on a project for school and I'd like to analyze sales of delivery food on weekends after the bigger Netflix releases. Does anyone know how I could get a hold of that? I'm going to try emailing a few companies but that might take a while to get info. Basically, anything works as long as it's delivery data by day, or even by week if it comes to that.", "url": "https://www.reddit.com/r/datasets/comments/awz0g8/looking_for_datasets_on_pizzavor_other_delivery/" }, { "description": "Hi y'all,\nI'm looking for a dataset that has time-of-login or time-of-arrival data for any museum, national park, or any other location. The ideal set would be simply a list of date-times and the associated location.
I could also work with data for a website (time-of-visit) but it would be less ideal.\nI've found national park data to the month and sometimes even the day but nothing so far with time-of-day as a field.\nThanks in advance!", "url": "https://www.reddit.com/r/datasets/comments/7xc9f9/looking_for_timestamped_dataset/" }, { "description": "I'm looking for a dataset of illegal drugs consumption through time in the USA, more precisely in each USA state.", "url": "https://www.reddit.com/r/datasets/comments/amf0ae/looking_for_dataset_about_illegal_drugs_over_time/" }, { "description": "Hi,\nI am looking for datasets of Java projects which display plagiarism; from direct copying to more significant obfuscation, e.g. changing names, rearranging structure, and modifying inconsequential fields. The purpose is to train a plagiarism detection and classification algorithm. Any help would be greatly appreciated.\nThanks.", "url": "https://www.reddit.com/r/datasets/comments/ax6bil/looking_for_dataset_comprising_of_plagiarized/" }, { "description": "Simply looking for a dataset that has books and features of those books. For the purpose of creating a recommendation model. If you guys know of a service that already does this that would be neat too!", "url": "https://www.reddit.com/r/datasets/comments/6knvqn/looking_for_dataset_for_books/" }, { "description": "I am looking for a free-access dataset of images of full human bodies, pref front and side shots with no objects blocking the human. Do you guys have any idea of where I can look to find such a dataset? I tried google but couldn't find what I wanted. Thank you very much Clarification: Pictures of people just standing around (clothes on) with their heights given.", "url": "https://www.reddit.com/r/datasets/comments/970zg1/looking_for_dataset_for_images_of_full_human/" }, { "description": "I'm looking for datasets with nutrition data for many commercial food products (i.e. Lucky Charms, Monster Energy, Nutella, etc). 
I'm just thinking of the table on the majority of product packaging, usually it's called Nutrition Facts or Supplement Facts. Ideally I'd want all types of food but a dataset of just cereals or just drinks would be cool as well.", "url": "https://www.reddit.com/r/datasets/comments/20zih2/request_nutrition_data_for_many_commercial/" }, { "description": "I'm doing a CRM POC and need a sample/anonymized dataset of businesses or consumers along with details on the purchaser, items , what's purchased, and if possible leads or opportunities. Doesn't have to be super large, but just a nice cross section of data to help us test systems and view how well they work with a variety of data.\n\nThanks ...", "url": "https://www.reddit.com/r/datasets/comments/b474as/looking_for_dataset_of_customerbusiness_purchases/" }, { "description": "Where can I find large sample files with multidimensional data?\nI'm interesting in both categorical and mostly quantitative data.\nAnything from spatio-temporal weather simulations to health studies or preferably anything with 20 variables.\nI want to try different visualization tools and I need data to test them.", "url": "https://opendata.stackexchange.com/questions/4656" }, { "description": "I'm trying to implement PageRank algorithm on a set of web pages, for that I need a sample dataset of web pages, and the web graph corresponding to them, this web graph represents the links between the pages that the data set contains.\nI need the web graph so I can get the transition matrix and do the calculation needed. 
Example:\n\nURL1 -> URL2\nURL3390 -> URL5\nURLxxxx is an id, somehow mapped to the corresponding web page\n\nMy question is: how/where can I get this resource (I've tried many links on the internet but nothing really helps), I would also like it to be not of a very large size, (internet connection limitation), if I can't have this as it is, could sou give me some advice on what I should do?\nUpdate: for people who may consider this off topic, and they may be right, networks like Software Recommendation or on Computer Science, don't even have corresponding tags, and doesn't really fit the kind of this question, I appreciate your help.", "url": "https://stackoverflow.com/questions/23376840" }, { "description": "Hello, I am looking for a dataset that includes every area code in the United States, and the major city/state matching each values. For example, if you go to this site (here ) and search for an area code, it returns the major city associated with it. I am creating an application that will include this feature (site mentioned earlier does not allow redistribution), so I will need the ability to redistribute it, please. I have found this dataset (here );however, I contacted the site and they said it's extremely old/out of date. I appreciate any help, thank you very much!", "url": "https://www.reddit.com/r/datasets/comments/5euqzz/looking_for_dataset_of_major_citystate_per_area/" }, { "description": "I'm looking for something similar to translation datasets where there are pairs of sentences in two different languages that mean the same thing, except programming languages instead of natural languages and code instead of sentences.", "url": "https://www.reddit.com/r/datasets/comments/7snzye/looking_for_dataset_of_the_same_programs_written/" }, { "description": "Looking for a data set on graduation rates by degree field for as far back as the 60s if possible. Would like to have it nationally as well as state by state if possible. 
Looking to see how certain degrees have gained or lost popularity over time.", "url": "https://www.reddit.com/r/datasets/comments/795dxi/looking_for_dataset_on_graduation_rates_by_degree/" }, { "description": "I am looking for a dataset (free or paid) that has human body size measurements that includes: shoulder width, arm length, bust/chest circumference, waist circumference, hips, inseam, height and neck size. It doesn't have to contain ALL of this info, but the more the better. Slightly related, is there a free/paid service out there that you can contact a large group of people to survey for such data? Thank you.\nI have found the FSMA anthropometric dataset online but it's not quite what I'm looking for.", "url": "https://www.reddit.com/r/datasets/comments/9b1bvz/looking_for_dataset_on_human_body_measurements/" }, { "description": "Trying to find a dataset on median household income/wages. Preferably over the last 20 years but less is okay too. US and Japan interest me most.", "url": "https://www.reddit.com/r/datasets/comments/8q5hwb/looking_for_dataset_on_median_household/" }, { "description": "I'm looking for a dataset with rates of teachers leaving the profession. It would be even better if the rates were broken down by type of teacher.", "url": "https://www.reddit.com/r/datasets/comments/994m2d/looking_for_dataset_with_rates_of_teachers/" }, { "description": "Hi, I'm new here, but am desperately looking for a dataset that has mapped (preferably IT) projects and labeled them as a success or failure. My goal is to predict whether a project will succeed or fail based on the project metrics such as budget, time, scope, etc.\nI have already looked online, but have not been able to find any yet.\n\nHopefully you guys can help me out. 
Thanks in advance!", "url": "https://www.reddit.com/r/datasets/comments/bgznrc/looking_for_it_project_successfailure_dataset/" }, { "description": "Hi, I'm new to data science in general and just got into a college project regarding data ming and NLP. I am looking for public datasets containing inpatient disease/diagnosis history. I intend to study the uses of data mining in analysing the risks of future ocurrences of diseases based on the previous ocurrences. Any advice would be appreciated.", "url": "https://www.reddit.com/r/datasets/comments/bbmw88/looking_for_medical_datasets/" }, { "description": "For my research, I'm looking for a dataset with images of parking lots taken from eye level.\nAll I've found so far is datasets with elevated CCTV style images, whereas what I need is more along the lines of google street view.\nDoes anyone know where I could find this kind of dataset?", "url": "https://www.reddit.com/r/datasets/comments/aro6og/looking_for_parking_lot_dataset/" }, { "description": "I am looking for jpg images for all unicode characters. I spent some time googling around and surprisingly had a difficult time finding a dataset. Does anyone know of such a dataset or is there anyway I can just generate it? I figure that should be possible, but I am unsure how to go about doing it.", "url": "https://www.reddit.com/r/datasets/comments/bbal71/looking_for_unicode_dataset_or_way_to_create_one/" }, { "description": "Hi all, I'm looking for datasets which contain information on public opinion on different policy issues (preferably for the UK). 
This is for my MSc dissertation in which I'll be looking at the extent to which opinions on policy affect voting behaviour.\nI've looked at a few things such as the British Social Attitudes Survey and the British Election Study, and I'm planning on using data from the BES Face to Face survey, which has more relevant variables than the BES panel surveys.\nI'm just wondering if anyone has any other suggestions on what might be useful. Cheers!", "url": "https://www.reddit.com/r/datasets/comments/bcvyjo/public_opinion_towards_policy/" }, { "description": "I'm looking for dataset about latency in games. I'm interested in logs from matchmaker where I'd get the list of IP and their respective latency. Dataset can be old, this is not to DDoS anyone, I'm interested in the latency-ip relationship. This is to improve my software.\nIf you know where I could get that please let me know. thanks.", "url": "https://www.reddit.com/r/datasets/comments/bblru9/request_dataset_about_multiplayer_lag_matchmaker/" }, { "description": "Dear All\n\nI am looking for dataset which consists of tweets or blogs or new articles or wikipedia articles which are related to football players and they either talk about some controversy or they have started a controvercy.\n\nRegards\nNDS", "url": "https://www.reddit.com/r/datasets/comments/avrdag/request_for_a_dataset_of_controversial/" }, { "description": "Hey,\nI'm looking for dataset preferably connected with sales/inventory specific to learn/tryout Power BI & SSRS, therefore I would appreciate if dataset would be uploadable to SQL Server database.", "url": "https://www.reddit.com/r/datasets/comments/5mf5ye/request_looking_for_sales_dataset_to_play_inside/" }, { "description": "I am looking for dataset which contains marks of students (of different schools and boards) in their final years at high school and first years at a particular university.", "url": "https://www.reddit.com/r/datasets/comments/b2vs4d/students_school_and_college_marks/" }, { 
"description": "I would like to request UK Life Sciences company data that includes number of related companies, turnover and employment in the aforementioned industry sector. Specifically, these industries are: Pharmaceutical, Medical, Technology, Medical Biotechnology and Industrial Biotechnology. These data will be used for my research to create new insight for business based on company's data within a specific industry.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like versions of the Input-Output Supply and Use Tables from 1980-1992. The relevant links at http://data.gov.uk/dataset/input-output_supply_and_use_tables are broken, and not all of the years are even ostensibly available.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am currently doing research for my final year dissertation. I have decided to base it on the effect of changes in management to organisational culture within the NHS. Is there a database which will provide me with a list of names of people who have been in NHS management within the past 10 years or so?", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am looking for data on use of public sector owned leisure centres and on activities used. e.g. number of users per day, week, month, times in the day (peaks); and what they did (swim, table tennis, zumba). 
I am surprised at the lack of free public data available on this subject, when the industry is very data rich and market led.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "A copy of the Broad Rental Market Areas (BRMA) geographic boundaries in a useable digital format", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Hello,\r\rIf possible i would like a list of business's which employ over 1,000 people within the Yorkshire region.\r\rMany thanks.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "MoT failure rates on passsenger cars - by car marque and by region. This information has been releases for the years 2008, 2009, 2010. Do you have the data available for 2011/2012/2013?", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Looking to find office buildings in Central London and Greater London that house 200 or more people in the office location. The 200 number can be comprised of many businesses located in one office address or a single business that houses 200 people at the location.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I need datasets for bicycle accidents on London Roads.I am study at London South bank university, i need this information for my final year Project", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Number of deaths in the last 20 years by any cause of death. 
Counts are required by age at death", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I need data for the England, Scotland, Wales similar to this below for NI that displays the water hardness classification and the geo-json boundaries", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Generalised land use database for the whole London for the year of 2013.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Modelled background pollution data for pm10 for at a high resolution for use as a raster map in GIS across the UK. Starting with January 2011 and ending in as close to the present day as possible. To be use at least at the local authority level and preferably using x, y coordinates or OS grid references. Same as data here/similar (http://uk-air.defra.gov.uk/data/pcm-data) but at the monthly timeframe level.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "A location database which holds all ATM locations within the UK. The dataset would need to contain physical location (i.e. lat/long) and (if possible) the type of ATM (Free/Surcharge) and the owner (HSBC, NatWest). 
It looks like this dataset already exists for Sunderland.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Hourly (or less) resolution rainfall records for three 5 week periods commencing 25th May 2007, 26th December 2007 & 6th June 2012, recorded at or from the nearest observation station to DN12 South Yorkshire.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am challenging my council tax banding and need to see what properties like mine were selling for in 1991.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like to access an up to date record of ALL council tax bands for every council.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Can I have the air quality data for Nottingham from Monday 30th March till the present day or most recent data available. This is following a large fire in Nottingham on Players St, Radford. If possible I would like to see over all air quality before the incident, and decline during the period stated. Also and any chemical identification that has been identified would be useful.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Hello. I would like any house sale prices for postcode TW5 9DJ and TW5 9DG between April 1989 and April 1993. I need to have evidence of these for challenging my current current council tax band.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Number of deaths in the last 20 years where cause is death has been certified as cancer. 
Counts are required by age of death", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "This UK data-set would show the amount of deaths annually including the mode of disposal, whether cremation, burial, water, natural or some other.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like to be able to access the database of live traffic camera images from the Highways Agency. I would like a list of camera locations and URLs to the images. I believe this data has been available freely in the past, however I have tried and failed to gain access recently.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Hourly or sub hourly Traffic count, vehicles speeds, and vehicle classifications related to Road sites air pollution data provided by London air quality Network on its website", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I need the annual time-series daily traffic flow of birmingham for research", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Some sort of analysis of wealth (not deprivation) at small area (OA level for preference). 
This could be something like average gross taxable income, average untaxed state benefits and any other useful additions", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "In your updates of scarlet fever notifications in the UK, graphs are produced displaying data from 2009/2010 to 2014/2015.\r\rPlease could I have this data to analyse for a University Project?", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "For my undergraduate dissertation at the University of East Anglia I intend to estimate the potential of dredging as a flood control measure for the Parrett Catchment. I have obtained daily precipitation data for the south-west and have created a GIS model to estimate potential discharge of the catchment. In order to validate my model I would require regular interval (e.g. daily) water depth or discharge data from a number of stations (ideally > 10) across the catchment. I am most interested in the period of Autumn 2013 to Spring 2014, should this period be available.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like flow data for the River Ribble. I am interested in gauging station and level-only data for three stations: Locks Weir (Stackhouse), Penny Bridge (Giggleswick) & Stainforth (if it still exists). I would like data (cumecs and/or river height) to be as detailed (shortest interval between measurements) as possible for the period 25th April 2015 to 13th May 2015. I appreciate the most recent data may require validation before it can be released.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "We require a dataset that models airborne salination across the UK. This must be a gridded dataset with extensive or complete land coverage for the UK. 
The dataset may model either sodium chloride particulates or chloride particulates and may be measured either as a mean background pollution or as an accumulated annual particulate deposition.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "A national record of all water related fatalities and incidents in the UK.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "For my Research Based Learning project Titled Multi Target Prediction In Drug Design I want the Data Set\rwhich will contain the details about the chemical compounds and their possible legends.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like an up to date listing of all the Academy Trusts with member schools of each trust (Primary and Secondary) For each trust the CEO ( trust leader)/ address. For each of the schools under the Academy trust - the Head teacher, address, number of students, age range, telephone number.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": " A list by post code and identifier of all low bridges in UK", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like for my PHD research to have rail passenger throughput numbers for a few point to point journeys covering 1 year (any UK point to point journey) . For example the numbers of passengers in one year that travel from Kingston to Guildford. 
The data is required as an input to perform some calculations.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Statisitics on Recruitment by Conscription to the UK Armed Forces in 1948 and annually until 1952", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am working on an undergraduate research thesis comparing the United States and the United Kingdom, their immigration policies, and appeal to immigrants. I am wondering if I could receive data regarding naturalisations in the UK per year for years 2004-2013, by region, and by country of birth.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am looking for either monthly or quarterly or annual Commercial and Industrial waste management data. Period 2007-2013. I see that this data is published only for 2004, 2006 and 2008. Is it possible to get the data until 2013 or the most latest updated period?\rRegards,\rAkhil", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Driver Vehicle Licensing Agency (DVLA) vehicles database. List of all licensed and unlicensed vehicles registered in GB. More than 80 million records. Includes the vehicle registration number plate.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Hi - is there a data set/data sets which included the information described above? I am looking to build up a picture of the patterns of numbers in care across different local authorities, at a particular point in time (although I am not too concerned about when that point in time is). 
Ideally I'd like to obtain the numbers in care and the amount the authority spends per year on such care, although either of these figures would be helpful on its own.\r\rI have searched the site but have not been able to find anything along the lines of the above.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Weather report, in particular rainfall in West Street, Glasgow for the period 5 April 2007 to September 2008.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "A full, national list of Unique Street Reference Numbers (USRNs) together with their associated metadata (geographic coordinates, status and classification information)", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Request annual return data on total numbers of Sheep & Lambs and Cattle & Calves in the following two North Yorkshire parishes from 1986 to the latest available date: for Malham Moor Parish and for Buckden Parish", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "For each GP surgery in the UK the data would list the contract type it holds (eg APMS, GMS, PMS) and the signatories to the contract. 
This would be the GP partners who own the surgery, or the parent company / corporate provider who provides primary care services.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "A breakdown of children in Newcastle East End who are inactive (less than 30mins per week) across the city wards of Byker, Heaton, Walker and Walkergate.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like to have the monthly fertility rate for UK which includes England, Wales, Scotland and North Ireland", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I would like a list of BSF and PFI schools opened in the last 10 years.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I can't seem to find any data on police spending over £500 (broken down by police force), I would like to see similar spend data to the sets which should be published for local councils.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "Data about Road Safety Data from 2014 to 2015 and daily or monthly refresh.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "I am running a data analysis of our marine staff pay to ensure that our pay scales are still competitive.
I am looking at the following roles: masters, chief engineers, chief officers, 2nd and 3rd officer, 2nd and 3rd engineers, purser, chefs, bosun and able seamen.", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" }, { "description": "School catchment area boundaries in a useful location format for schools across the whole country (or even across the UK, if such data is available)", "url": "https://github.com/chabrowa/data-requests-query-dataset/blob/master/datarequests.csv" } ] }