PHP-Name-Parser

PHP library to split names into their respective components. Besides detecting first and last names, this library attempts to handle prefixes, suffixes, initials and compound last names like "Von Fange". It also normalizes prefixes (Mister -> Mr.) and fixes capitalization (JOHN SMITH -> John Smith).

Usage:

// Include the composer autoloader.
include("vendor/autoload.php");

// Import the class.
use joshfraser\FullNameParser;

// Create an instance.
$parser = new FullNameParser();
$parser->parse_name("Mr Anthony R Von Fange III");

// Or call statically.
FullNameParser::parse("Mr Anthony R Von Fange III");

Results:

array(7) {
  ["salutation"]     => string(3) "Mr."
  ["fname"]          => string(7) "Anthony"
  ["initials"]       => string(1) "R"
  ["lname"]          => string(9) "Von Fange"
  ["lname_base"]     => string(5) "Fange"
  ["lname_compound"] => string(3) "Von"
  ["suffix"]         => string(3) "III"
}

The algorithm:

We start by splitting the full name into separate words. We then do a dictionary lookup on the first and last words to see if they are a common prefix or suffix. Next, we take the middle portion of the string (everything minus the prefix and suffix) and look at everything except the last word of that string. We loop through each of those words, concatenating them together to make up the first name. While we're doing that, we watch for any indication of a compound last name. It turns out that almost every compound last name starts with one of 16 prefixes (Von, Van, Vere, etc.). If we see one of those prefixes, we break out of the first-name loop and move on to concatenating the last name. We handle the capitalization issue by checking for camel case before uppercasing the first letter of each word and lowercasing everything else. We also special-case periods and dashes, and there are a couple of other special cases, such as ignoring words in parentheses altogether.
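The splitting steps described above can be sketched roughly as follows. This is a simplified illustration, not the library's actual code: the function name `split_name_sketch` and the three small dictionaries are hypothetical stand-ins, initials are left inside the first name instead of being separated out, and capitalization, parentheses and other special cases are ignored.

```php
<?php
// Hypothetical sample dictionaries. The real library ships its own, larger lists.
const SKETCH_PREFIXES  = ['mr' => 'Mr.', 'mister' => 'Mr.', 'mrs' => 'Mrs.', 'ms' => 'Ms.', 'dr' => 'Dr.'];
const SKETCH_SUFFIXES  = ['jr', 'sr', 'ii', 'iii', 'iv'];
const SKETCH_COMPOUNDS = ['von', 'van', 'vere'];

function split_name_sketch(string $full_name): array
{
    // Step 1: split the full name into separate words.
    $words = preg_split('/\s+/', trim($full_name), -1, PREG_SPLIT_NO_EMPTY);

    $salutation = '';
    $suffix = '';

    // Step 2: dictionary lookup on the first word (prefix)...
    if (count($words) > 1) {
        $key = strtolower(rtrim($words[0], '.'));
        if (array_key_exists($key, SKETCH_PREFIXES)) {
            $salutation = SKETCH_PREFIXES[$key]; // normalized form, e.g. "Mr."
            array_shift($words);
        }
    }

    // ...and on the last word (suffix).
    if (count($words) > 1) {
        $key = strtolower(rtrim($words[count($words) - 1], '.'));
        if (in_array($key, SKETCH_SUFFIXES, true)) {
            $suffix = array_pop($words);
        }
    }

    // Step 3: walk every remaining word except the last one, collecting the
    // first name, and stop early if a compound last-name prefix shows up.
    $first = [];
    $i = 0;
    $last_index = count($words) - 1;
    while ($i < $last_index) {
        if (in_array(strtolower($words[$i]), SKETCH_COMPOUNDS, true)) {
            break;
        }
        $first[] = $words[$i];
        $i++;
    }

    // Step 4: everything from the stopping point onward is the last name.
    $last = array_slice($words, $i);

    return [
        'salutation' => $salutation,
        'fname'      => implode(' ', $first),
        'lname'      => implode(' ', $last),
        'suffix'     => $suffix,
    ];
}
```

Calling `split_name_sketch("Mr Anthony R Von Fange III")` with these sample dictionaries yields the salutation "Mr.", the first name "Anthony R", the last name "Von Fange" and the suffix "III".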

Check examples.php for the test suite and examples of how various name formats are parsed.

Possible improvements

  • Handle the "Lname, Fname" format
  • Separate name parsing from normalization and capitalization, and make the latter two optional
  • Separate the dictionaries from the code to make localization easier
  • Add common name libraries to allow for things like gender detection
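As an illustration of the first item, one possible approach is to reorder "Lname, Fname" input before handing it to the parser. The helper below is hypothetical and not part of the library. It assumes that a single comma separates the last name from the rest, so it would misorder input such as "John Smith, Jr." unless the part after the comma were first checked against the suffix dictionary.

```php
<?php
// Hypothetical pre-processing helper: turns "Lname, Fname" into "Fname Lname".
function reorder_comma_name_sketch(string $name): string
{
    // Split on the first comma only.
    $parts = explode(',', $name, 2);
    if (count($parts) !== 2) {
        return trim($name); // no comma: leave the name as it is
    }

    $last  = trim($parts[0]);
    $first = trim($parts[1]);
    if ($last === '' || $first === '') {
        // Stray leading or trailing comma: just strip it.
        return trim($name, " ,");
    }

    return $first . ' ' . $last;
}
```

For example, `reorder_comma_name_sketch("Von Fange, Anthony R")` returns "Anthony R Von Fange", which could then be passed to `FullNameParser::parse()`.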

Same logic, different languages

Credits & license:
