Git Product home page Git Product logo

gender-guesser's Introduction

Gender Guesser

https://travis-ci.org/lead-ratings/gender-guesser.svg?branch=master

This package uses the underlying data from the program "gender" by Jorg Michael (described here). Its use is pretty straightforward:

>>> import gender_guesser.detector as gender
>>> d = gender.Detector()
>>> print(d.get_gender(u"Bob"))
male
>>> print(d.get_gender(u"Sally"))
female
>>> print(d.get_gender(u"Pauley")) # should be androgynous
andy

The result will be one of unknown (name not found), andy (androgynous), male, female, mostly_male, or mostly_female. The difference between andy and unknown is that the former is found to have the same probability to be male than to be female, while the later means that the name wasn't found in the database.

I18N is fully supported:

>>> print(d.get_gender(u"\xc1lfr\xfan"))  # u"Álfrún"
female

Additionally, you can give preference to specific countries:

>>> print(d.get_gender(u"Jamie"))
mostly_female
>>> print(d.get_gender(u"Jamie", u'great_britain'))
mostly_male

Additionally, you can create a detector that is not case sensitive (default is to be case sensitive):

>>> d = gender.Detector(case_sensitive=False)
>>> print(d.get_gender(u"sally"))
female
>>> print(d.get_gender(u"Sally"))
female

Try to avoid creating many Detectors, as each creation means reading the data file.

Licenses

The generator code is distributed under the GPLv3. The data file nam_dict.txt is released under the GNU Free Documentation License.

Changelog

1.0.0 (2020-09-19)

  • Move completely to Python 3.

0.4.0 (2016-12-05)

  • Fix bug where spaces in composite names were shadowing other names

0.3.0 (2016-07-02)

  • Remove unknown_value init option, since it can be implemented very easily with a wrapper if needed.
  • Return unknown when name is not found and andy when it is valid equally for both male and female.
  • Test README examples as doctests.
  • Fix incorrect country-wise gender detection for non-iso886-15 names coming from line length change after data file conversion to UTF-8. See #gh2. Thanks @miquelcamprodon.

0.2.0 (2015-12-06)

  • Wire in tox to test in both Python 2 and Python 3.
  • Python 2 and 3 compatiblity.
  • Remove obsolete character mapper code.

For previous versions, see sexmachine.

Credits

This is a fork of gender-guesser packagee by lead-ratings <https://github.com/lead-ratings/gender-guesser> which in itself is a fork of the SexMachine package by Ferhat Elmas. It was created to be able to publish a Python 3 compatible version to PyPI and to be able add some more improvements without bugging the original author.

gender-guesser's People

Contributors

dukebody avatar ferhatelmas avatar skytreader avatar dzpm avatar jamesmeneghello avatar arnau126 avatar ludovicl avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.