Collection | Software, E-Resource Simple English Wikipedia. Simplewiki
About this Item
- Title
- Simple English Wikipedia.
- Other Title
- Simplewiki
- Summary
- The dataset is composed of the content of Simple Wikipedia, including articles and revision history, in XML. The XML dumps are in an Export format and compressed in bzip2 and .7z formats, while the SQL dumps are in mysqldump format. See https://meta.wikimedia.org/wiki/Data_dumps
- Contributor Names
- Wikimedia Foundation, publisher.
- Created / Published
- [San Francisco, CA] : Wikimedia Foundation
- Contents
- Articles, templates, media/file descriptions, and primary meta-pages -- All pages with complete edit history -- All pages with complete page edit history -- Log events to all pages and users -- All pages, current versions only -- First-pass for page XML data dumps -- Extracted page abstracts for Yahoo.
- Subject Headings
- - Electronic encyclopedias
- Genre
- Data sets
- Notes
- - Website for dataset launched November 17, 2003.
- - "This is the front page of the Simple English Wikipedia. Wikipedias are places where people work together to write encyclopedias in different languages. We use Simple English words and grammar here. The Simple English Wikipedia is for everyone! That includes children and adults who are learning English." - website home page.
- - "A complete copy of all Wikimedia wikis, in the form of wikitext source and metadata embedded in XML. A number of raw database tables in SQL form are also available. These snapshots are provided at the very least monthly and usually twice a month." - Wikimedia Downloads Database backup dumps page.
- - First downloaded by the Library of Congress on January 23, 2019.
- - Title from website home page (viewed May 8, 2019).
- Medium
- textual datasets
- Call Number/Physical Location
- AE5
- Repository
- s-Online Electronic Resource
- Digital Id
- https://simple.wikipedia.org
- https://dumps.wikimedia.org/backup-index.html
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20190101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20200120
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20210101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20220101
- Library of Congress Control Number
- 2019205402
- Rights Advisory
- Creative Commons Attribution-ShareAlike 3.0 United States https://creativecommons.org/licenses/by-sa/3.0/us/
- Online Format
- compressed data
- LCCN Permalink
- https://lccn.loc.gov/2019205402
- Additional Metadata Formats
- MARCXML Record
- MODS Record
- Dublin Core Record
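The Summary above describes bzip2-compressed XML dumps in an export format. A minimal sketch of how such a file might be read without decompressing it to disk first is given below. It assumes the usual MediaWiki export layout, in which each `<page>` element holds a `<title>` child; the function names and any file name passed in are hypothetical, and the XML namespace is stripped because its version string differs between dumps.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(dump_path):
    """Yield the <title> text of each <page> in a bzip2-compressed XML dump.

    Assumes the MediaWiki export layout (<page> elements with a <title> child).
    """
    with bz2.open(dump_path, "rb") as stream:
        # iterparse reads the stream incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
                    break
            elem.clear()  # drop the finished page's children to limit memory use
```

Calling `list(iter_page_titles(path))` on a small dump collects every page title; for a full-history dump it is probably better to consume the generator one title at a time.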
More Collections like this
- Collection: Enron email dataset. "This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders....
- Contributor: Enron Corp - United States. Federal Energy Regulatory Commission - Cohen, William W.
- Date: 2015
- Collection: Grand comics database dataset. GCD. A database of creator credits, story details, and other information useful to the readers and fans of comic books, minicomics, and fanzines.
- Contributor: Grand Comics Database Project
- Date: 1994
- Collection: National Enquirer Index and Database Files, 1977-. The National Enquirer Index and Database Files were created by Mike Handy and other retired Library of Congress staff and volunteers to index the microfilm holdings of National Enquirer at the Library...
- Contributor: Handy, Mike
- Date: 1977
- Collection: Meme Generator : collected datasets (Meme Generator; Meme Generator dataset). Meme Generator allows users to create and share image macros (featuring a picture, or artwork, superimposed with text) in the style of popular internet memes. The site also serves as a searchable...
- Contributor: Library of Congress - American Folklife Center
- Date: 2010
- Collection: Dinosaur comics. This dataset was generated from content harvested from the Library of Congress's web archive of qwantz.com (Dinosaur Comics!): https://www.loc.gov/item/lcwaN0009953/. It includes minimal metadata about 3,325 image objects from the Dinosaur Comics! web...
- Contributor: Library of Congress Web Archiving Program - North, Ryan
- Date: 2019