
Comments (2)

Goddak commented on June 18, 2024

I have since updated the Python scripts and fixed this, but I am unable to push the changes.

dinosaurs.py

endpoints = ['aardonyx', 'abelisaurus', 'achelousaurus', 'achillobator', 'acrocanthosaurus', 'aegyptosaurus', 'afrovenator', 'agilisaurus', 'alamosaurus', 'albertaceratops', 'albertosaurus', 'alectrosaurus', 'alioramus', 'allosaurus', 'alvarezsaurus', 'amargasaurus', 'ammosaurus', 'ampelosaurus', 'amygdalodon', 'anchiceratops', 'anchisaurus', 'ankylosaurus', 'anserimimus', 'antarctosaurus', 'apatosaurus', 'aragosaurus', 'aralosaurus', 'archaeoceratops', 'archaeopteryx', 'archaeornithomimus', 'argentinosaurus', 'arrhinoceratops', 'atlascopcosaurus', 'aucasaurus', 'austrosaurus', 'avaceratops', 'avimimus', 'bactrosaurus', 'bagaceratops', 'bambiraptor', 'barapasaurus', 'barosaurus', 'baryonyx', 'becklespinax', 'beipiaosaurus', 'bellusaurus', 'borogovia', 'brachiosaurus', 'brachylophosaurus', 'brachytrachelopan', 'buitreraptor', 'camarasaurus', 'camptosaurus', 'carcharodontosaurus', 'carnotaurus', 'caudipteryx', 'cedarpelta', 'centrosaurus', 'ceratosaurus', 'cetiosauriscus', 'cetiosaurus', 'chaoyangsaurus', 'chasmosaurus', 'chindesaurus', 'chinshakiangosaurus', 'chirostenotes', 'chubutisaurus', 'chungkingosaurus', 'citipati', 'coelophysis', 'coelurus', 'coloradisaurus', 'compsognathus', 'conchoraptor', 'confuciusornis', 'corythosaurus', 'cryolophosaurus', 'dacentrurus', 'daspletosaurus', 'datousaurus', 'deinocheirus', 'deinonychus', 'deltadromeus', 'dicraeosaurus', 'dilophosaurus', 'diplodocus', 'dracorex', 'dromaeosaurus', 'dromiceiomimus', 'dryosaurus', 'dryptosaurus', 'dubreuillosaurus', 'edmontonia', 'edmontosaurus', 'einiosaurus', 'elaphrosaurus', 'emausaurus', 'eolambia', 'eoraptor', 'eotyrannus', 'equijubus', 'erketu', 'erlikosaurus', 'euhelopus', 'euoplocephalus', 'europasaurus', 'eustreptospondylus', 'fukuiraptor', 'fukuisaurus', 'gallimimus', 'gargoyleosaurus', 'garudimimus', 'gasosaurus', 'gasparinisaura', 'gastonia', 'giganotosaurus', 'gilmoreosaurus', 'giraffatitan', 'gobisaurus', 'gorgosaurus', 'goyocephale', 'graciliceratops', 'gryposaurus', 
'guaibasaurus', 'guanlong', 'hadrosaurus', 'hagryphus', 'haplocanthosaurus', 'harpymimus', 'herrerasaurus', 'hesperosaurus', 'heterodontosaurus', 'heyuannia', 'homalocephale', 'huayangosaurus', 'hylaeosaurus', 'hypacrosaurus', 'hypsilophodon', 'iguanodon', 'indosuchus', 'irritator', 'isisaurus', 'janenschia', 'jaxartosaurus', 'jingshanosaurus', 'jinzhousaurus', 'jobaria', 'juravenator', 'kentrosaurus', 'khaan', 'kotasaurus', 'kritosaurus', 'lambeosaurus', 'lapparentosaurus', 'leaellynasaura', 'leptoceratops', 'lesothosaurus', 'liaoceratops', 'ligabuesaurus', 'liliensternus', 'lophorhothon', 'lophostropheus', 'lufengosaurus', 'lurdusaurus', 'lycorhinus', 'magyarosaurus', 'maiasaura', 'majungasaurus', 'malawisaurus', 'mamenchisaurus', 'mapusaurus', 'marshosaurus', 'masiakasaurus', 'massospondylus', 'maxakalisaurus', 'megalosaurus', 'melanorosaurus', 'metriacanthosaurus', 'microceratus', 'micropachycephalosaurus', 'microraptor', 'minmi', 'monolophosaurus', 'mononykus', 'mussaurus', 'muttaburrasaurus', 'nanshiungosaurus', 'nedoceratops', 'nemegtosaurus', 'neovenator', 'neuquenosaurus', 'nigersaurus', 'nipponosaurus', 'noasaurus', 'nodosaurus', 'nomingia', 'nothronychus', 'nqwebasaurus', 'omeisaurus', 'opisthocoelicaudia', 'ornitholestes', 'ornithomimus', 'orodromeus', 'oryctodromeus', 'othnielia', 'ouranosaurus', 'oviraptor', 'pachycephalosaurus', 'pachyrhinosaurus', 'panoplosaurus', 'pantydraco', 'paralititan', 'parasaurolophus', 'parksosaurus', 'patagosaurus', 'patagotitan', 'pelicanimimus', 'pelorosaurus', 'pentaceratops', 'piatnitzkysaurus', 'pinacosaurus', 'plateosaurus', 'podokesaurus', 'poekilopleuron', 'polacanthus', 'prenocephale', 'probactrosaurus', 'proceratosaurus', 'procompsognathus', 'prosaurolophus', 'protarchaeopteryx', 'protoceratops', 'protohadros', 'psittacosaurus', 'quaesitosaurus', 'rebbachisaurus', 'rhabdodon', 'rhoetosaurus', 'rinchenia', 'riojasaurus', 'rugops', 'saichania', 'saltasaurus', 'sarcosaurus', 'saurolophus', 'sauropelta', 
'saurophaganax', 'saurornithoides', 'scelidosaurus', 'scutellosaurus', 'secernosaurus', 'segisaurus', 'segnosaurus', 'shamosaurus', 'shanag', 'shantungosaurus', 'shunosaurus', 'shuvuuia', 'silvisaurus', 'sinocalliopteryx', 'sinornithosaurus', 'sinosauropteryx', 'sinovenator', 'sinraptor', 'sonidosaurus', 'spinosaurus', 'staurikosaurus', 'stegoceras', 'stegosaurus', 'stenopelix', 'struthiomimus', 'struthiosaurus', 'styracosaurus', 'suchomimus', 'supersaurus', 'talarurus', 'tanius', 'tarbosaurus', 'tarchia', 'telmatosaurus', 'tenontosaurus', 'thecodontosaurus', 'therizinosaurus', 'thescelosaurus', 'torosaurus', 'torvosaurus', 'triceratops', 'troodon', 'tsagantegia', 'tsintaosaurus', 'tuojiangosaurus', 'tylocephale', 'tyrannosaurus', 'tyrannotitan', 'udanoceratops', 'unenlagia', 'urbacodon', 'utahraptor', 'valdosaurus', 'velociraptor', 'vulcanodon', 'yandusaurus', 'yangchuanosaurus', 'yimenosaurus', 'yingshanosaurus', 'yinlong', 'yuanmousaurus', 'yunnanosaurus', 'zalmoxes', 'zephyrosaurus', 'zuniceratops']

field-scaper.py

from bs4 import BeautifulSoup
import requests
import dinosaurs as dino_names

f = open("../dinosaurs.js", "w")
f.write("[")

fields_collection = []
model = 'https://www.nhm.ac.uk/discover/dino-directory/'
for name in dino_names.endpoints:
    fields = {
        "name": "",
        "type": "",
        "length": "",
        "weight": "",
        "diet": "",
        "teeth": "",
        "food": "",
        "movement": "",
        "era": "",
        "location": "",
    }
    html = requests.get(model + name + '.html').text
    parsed_html = BeautifulSoup(html, features='html.parser')
    nameContainer = parsed_html.find('h1', 'dinosaur--name dinosaur--name-unhyphenated')
    fields['name'] = nameContainer.contents[0]

    overviewContainer = parsed_html.find('dl', 'dinosaur--description dinosaur--list')
    typeContainer = overviewContainer.find('a')
    fields['type'] = typeContainer.contents[0]
    overviewDataContainer = overviewContainer.find_all('dd')
    # Some dinosaurs don't have documented lengths
    if len(overviewDataContainer) >= 2:
        fields['length'] = overviewDataContainer[1].contents[0]
    # Some dinosaurs don't have documented weights
    if len(overviewDataContainer) >= 3:
        fields['weight'] = overviewDataContainer[2].contents[0]

    detailsContainer = parsed_html.find('dl', 'dinosaur--info dinosaur--list')
    detailsHeadings = detailsContainer.find_all('dt')
    detailsData = detailsContainer.find_all('dd')
    headingMap = {
        "Diet:": "diet",
        "Teeth:": "teeth",
        "Food:": "food",
        "How it moved:": "movement",
        "When it lived:": "era",
        "Found in:": "location",
    }

    # Headings whose value is wrapped in a link; the others hold plain text.
    linkedHeadings = ("When it lived:", "Diet:", "Found in:")
    for i, heading in enumerate(detailsHeadings):
        mapableHeading = heading.contents[0]
        if mapableHeading not in headingMap:
            continue
        if mapableHeading in linkedHeadings:
            fields[headingMap[mapableHeading]] = detailsData[i].find('a').contents[0]
        else:
            fields[headingMap[mapableHeading]] = detailsData[i].contents[0]

    print(fields)

    f.write(str(fields))
    f.write(',')


f.write(']')
f.close()
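
One caveat with the output step above: `str(fields)` writes a Python dict repr (usually single-quoted strings), and the loop leaves a trailing comma before the closing `]`. JavaScript accepts that as an array literal, but a JSON parser rejects it. If strict JSON is wanted, a minimal sketch using the standard `json` module could look like this. The helper name `write_records`, the temporary output path, and the sample record are assumptions for illustration and are not part of the original script.

```python
import json
import os
import tempfile


def write_records(records, path):
    """Write a list of dicts to `path` as a strict JSON array."""
    # The context manager closes the file even if serialization fails.
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII characters readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)


# Hypothetical sample record using the same keys as `fields` in the script.
sample = [{"name": "Aardonyx", "type": "", "length": "", "weight": "",
           "diet": "", "teeth": "", "food": "", "movement": "",
           "era": "", "location": ""}]

# Hypothetical output location, chosen so the sketch runs anywhere.
out_path = os.path.join(tempfile.gettempdir(), "dinosaurs.json")
write_records(sample, out_path)

# Reading the file back confirms that it parses as JSON.
with open(out_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

In the scraper, each `fields` dict could be appended to `fields_collection` (which the script declares but never fills) and passed to a helper like this once, after the loop.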

If you give me permission, I will push the changes and open a PR. Otherwise, feel free to update the repo using the code above.

P.S. Please don't judge my Python skills; I am a JavaScript guy 😆

from tyrannosaurus.rest.

Goddak commented on June 18, 2024

Accidental close. This isn't resolved until the repo is updated.

