20f-artificien's People

Contributors

alexquill, epsteinj, kenneym, shreyas-v-agnihotri, timofei7, tobiaslange18

20f-artificien's Issues

gamify app, make user want to sign up

Motivations:

  • As Danielle the data-producing consumer, I want to have an enjoyable user experience on the app that I am using.

When creating an application that can gather data from users, we want the app to be genuinely enjoyable such that there is high user retention (and we can gather more complete data on users), high usage of the app (more usage means more data), and that our users do not feel like they are being taken advantage of in any way.

Page Built: "Models"

Motivations:

  • As Evan the EBITDA Evangelist, I want to be able to see how my models are performing as well as models that are completed.

Develop barebones client dashboard

Motivations:

  • As Evan the EBITDA Evangelist, I want to navigate the Artificien marketplace dashboard so I can browse what kind of data is best for me

This dashboard will be a simple effort, with some buttons and views that allow potential clients to see what kind of offerings we have. This website will begin as simple, allowing for further styling, pages, and data-sorting complexity as we expand our dataset offerings.

Page Built: "Models"

Motivations:

  • As Evan the EBITDA Evangelist/Helen the Healthcare Researcher, I want to be able to see my models in progress and models that have completed

Jupyter user accounts and model upload

Motivations:

  • As Evan the EBITDA evangelist, I want to partner with Artificien by making an account for my firm on their dashboard/marketplace so I can train my stupid models on huge sets of their proprietary data

A vital functionality of the barebones dashboard/marketplace will be the ability for our clients to upload their models to be trained once they've agreed to partner with Artificien. They will do so through their client accounts, which must be strongly and securely authenticated. This model will be that which we train in a federated way on the data of the client's choice.

Page Built: "Landing page"

Motivations:

  • As Evan the EBITDA Evangelist, I want to be able to access the Artificien landing page, see the pretty graphics, and log in!

Integrate with & develop method to access data collected and stored in app developer created DBs

Motivations:

  • As #5, I sometimes prefer to build my apps with a database that stores all relevant user profile information and all relevant user data, rather than keeping any data stored on-device. While my devices keep a bit of data transiently, they mainly just send data to my Database, which keeps the data longer-term.

  • In order for Artificien to access mobile data in this case, we need to integrate with #5's database, rather than integrating with mobile devices themselves.

  • In order to handle these cases, we will need to develop a method to integrate with app developer DBs. We need a method to convert Firebase, MongoDB, and other types of DBs into federated "worker nodes."
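
One way to picture a database acting as a "worker node" is a thin wrapper that runs a job next to the data and returns only a computed summary. The sketch below is a minimal illustration under stated assumptions: `RecordSource`, `InMemorySource`, `WorkerNode`, and `sum_and_count` are hypothetical names, and the in-memory source stands in for a real Firebase or MongoDB adapter, which is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Protocol

Record = Dict[str, float]
Job = Callable[[Iterable[Record]], Dict[str, float]]


class RecordSource(Protocol):
    """Anything that can yield records from an app developer's database."""

    def records(self) -> Iterable[Record]: ...


@dataclass
class InMemorySource:
    """Hypothetical stand-in for a Firebase or MongoDB adapter."""

    rows: List[Record]

    def records(self) -> Iterable[Record]:
        return iter(self.rows)


@dataclass
class WorkerNode:
    """Wraps a database so that only computed summaries leave it."""

    source: RecordSource

    def run(self, job: Job) -> Dict[str, float]:
        # The job runs next to the data; raw rows are never returned.
        return job(self.source.records())


def sum_and_count(field: str) -> Job:
    """Build a job that reports the sum and count of one numeric field."""

    def job(rows: Iterable[Record]) -> Dict[str, float]:
        values = [row[field] for row in rows if field in row]
        return {"sum": float(sum(values)), "count": float(len(values))}

    return job


worker = WorkerNode(InMemorySource([{"age": 21.0}, {"age": 34.0}, {"steps": 9000.0}]))
summary = worker.run(sum_and_count("age"))
```

Supporting another database type would then mean writing one more class with a `records()` method, while the worker and the jobs stay unchanged.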

Page Built: "Data Library"

Motivations:

  • As Evan/Helen, I want to be able to see the data I can select, so I can run my model with a specific use-case in mind

Evan the EBITDA Evangelist

Background and Demographic Information

  • Nickname: Evan the EBITDA Evangelist
  • Demographics: A profit loving hedge fund manager who wants to use accessed consumer data to generate alpha
  • Overheard quote: "I'm paying those bastard data boutiques tens of millions for consumer datasets just so I can meet beta and compete with Ackman daddy down the street. If I can pay less and get unique, structured consumer data hell yeah!"

Narrative

I shelled out millions of dollars to a data boutique for a dataset on Walmart customers. Turns out all my competitors purchased the same data, defeating the purpose of generating alpha with a data- and quant-driven strategy in the first place. If I want to be the hottest hedge fund manager on the street I need unique, structured, and large consumer datasets that no one else has. If Artificien can give me access to this I'd shell out big bucks.

Behavioral and Dimensional Information

  • Goals and Motivations:
    - Get access to large consumer datasets that pertain to my investing strategy
    - Have access to data no one else has, in order to generate alpha
    - Make big money and return EBITDA (Earnings Before Interest, Taxes, Depreciation, and Amortization)
  • Tasks:
    1. Develop some thesis about an equity, market, or financial instrument
    2. Buy access from Artificien to data that pertains to the thesis
    3. Validate thesis and develop quantitative strategy around data
    4. Deploy strategy to market, generate alpha
    5. Exit positions with big gains
    6. Rinse and repeat 1-5 for the lifecycle of fund
    7. Return EBITDA less 10/20 to limited partners
  • Pain Points, Concerns, and Challenges:
    - Have trouble finding places to buy good data - data boutiques are often sketch
    - Have trouble finding data at reasonable cost - data boutiques are often super expensive
    - Have trouble finding live datasets
    - Have trouble finding data no one else has
  • User Flow
    - Evan and his quants develop some thesis on financial markets, equities, or instruments
    - Evan comes to Artificien looking for a specific data type, e.g. Walmart customers
    - Evan pays Artificien big bucks to access the data he's looking for. Evan pads the going rate for this data significantly such that we don't resell access to this data to another party for some period of time
    - Evan and his quants train their models in a federated way, defining their strategy and validating their thesis
    - Evan takes his learned models and deploys them.

Convert regular ML algorithm to FL algorithm

Motivations:

  • As Helen the Healthcare researcher and Evan the EBITDA Evangelist I want to be able to input my machine learning algorithm and have it converted to an FL algorithm so that Artificien can run the algorithm in a federated way on the dataset I have selected.

Federated learning is not widely known, and we do not expect every developer who uses our platform to have built their models in a federated way. As a result, we want to be able to convert traditional machine learning models into federated learning models so that we can train them on user data.
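
The issue does not say which conversion approach Artificien will use. One common technique is federated averaging: the developer's ordinary training function is left as is, each device runs it on its own data, and the server averages the resulting weights. The sketch below assumes a toy linear model and hypothetical names (`train_linear`, `federate`); it is an illustration of the idea, not the platform's implementation.

```python
from typing import Callable, List, Sequence, Tuple

Weights = List[float]
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs
TrainFn = Callable[[Weights, Dataset], Weights]


def train_linear(weights: Weights, data: Dataset, lr: float = 0.05, epochs: int = 20) -> Weights:
    """Ordinary (centralized) gradient descent for y = w * x + b."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return [w, b]


def federate(train: TrainFn, rounds: int) -> Callable[[Weights, List[Dataset]], Weights]:
    """Wrap a local training function in federated averaging rounds."""

    def run(global_weights: Weights, device_data: List[Dataset]) -> Weights:
        total = sum(len(data) for data in device_data)
        for _ in range(rounds):
            # Each device trains from the current global weights on its own data
            updates = [(train(list(global_weights), data), len(data)) for data in device_data]
            # The server averages the results, weighted by local dataset size
            global_weights = [
                sum(local[i] * n for local, n in updates) / total
                for i in range(len(global_weights))
            ]
        return global_weights

    return run


# Two simulated devices whose points all lie on y = 2x + 1
devices = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
federated_train = federate(train_linear, rounds=50)
w, b = federated_train([0.0, 0.0], devices)
```

Only weights cross the device boundary in this sketch; the `(x, y)` pairs are read solely inside `train`, which is the property the conversion is meant to preserve.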

PyGrid API Trial Run

Get an example of pygrid api running

  • Figure out how to send model training plans, model itself, and client-side configuration parameters to PyGrid and do a trial run where we actually train a model on some devices.

Paul the Profit-Prioritizing Programmer

Background and Demographic Information

  • Nickname: Paul the Profit-Prioritizing Programmer
  • Demographics: 25-year old app developer, Hispanic, Middle-class, bootcamp-educated, lives in the Bay Area
  • Overheard quote: "I'm really tryna sell this user data but GDPR and CCPA make it such a pain to collect data on my users and monetize it to make fat stacks!"

Narrative

Paul is an app development and marketing whiz. After becoming interested in coding mobile apps for iOS and Android, he completed a coding bootcamp in the Bay Area that taught him what he needed to make habit-forming products that stick. He recently broke through in the consumer app space with a third-party dating app that he hopes will compete with Tinder and Bumble. The app has racked up 100,000 downloads and he's looking to monetize, because he still needs to pay back the bootcamp's initial investment. In-app ads are insufficient as a revenue stream given his user base, and he knows that the data he collects on his users is quite valuable (their likes and dislikes, demographics, and more). Still, recent consumer privacy laws and the public uproar over data privacy make him unsure how to monetize his data without selling it directly to nefarious agents without his users' permission. He hopes Artificien will give him a piece of the pie for making his users' data accessible without selling it on the black market or breaking any privacy laws.

Behavioral and Dimensional Information

  • Goals and Motivations:
    • Collect data about users in a safe and privacy-centric way
    • Identify other developers, researchers, and businesses who would be interested in the user data his platform collects
    • Better monetize his product to make it a more steady stream of income
  • Tasks:
    1. Ask for user permission to collect personal data (self-inputted) and phone data (location). Store data in an encrypted and secure server with well-written authorization/authentication rules.
    2. Reach out to market research companies who are willing to pay for anonymized consumer data. Contact other app developers through online forums offering to sell data.
    3. Set up in-app ads and cross-promotions with other products. Experiment with freemium and paid pricing models.
  • Pain Points, Concerns, and Challenges:
    • Paul is worried that his users will shy away from using his app or will be angry if they are notified that he is selling their data to third parties
    • Paul doesn't know how to compete with giants like Facebook and Google who can offer extremely targeted and valuable personal information to advertisers for a very low price. His data is valuable too, but advertisers are hard to find without a user base like theirs.
    • Paul doesn't know how to find people willing to pay for his data.
    • Paul is concerned his app won't make enough money to sustain him, and that he'll have to go back to a bootcamp or accept an entry-level software engineering job that doesn't let him pursue his passion to build apps start to finish.
    • Paul is concerned about his users' data falling into the wrong hands or having a massive security breach.
  • User Flow
    1. Paul logs into the Artificien dashboard and registers his app with its unique ID
    2. Paul browses and understands the range of data-gathering partners Artificien works with
    3. After registering, Paul follows the Artificien onboarding tutorials and documentation to integrate the API into his app at key junctures when user data is collected.
    4. Paul notifies his users during the data permissions step that their data may be used for data training purposes but will never leave their device.
    5. Paul monitors app usage and crash data to ensure that Artificien's on-device processing does not burden his users. He finds nothing to suggest that it does.
    6. As time goes on Artificien automatically connects Paul with data-buyers looking for the type of data his app offers.
    7. Paul logs back into his Artificien dashboard to see how his data-sharing credits have accumulated as a result of third parties using his data in a federated way.
    8. Paul cashes out on the Artificien platform through a secure checkout process, giving him the cash he needs to keep his app and his dreams afloat.

Danielle the Data-Producing Consumer

Background and Demographic Information

  • Nickname: Danielle
  • Demographics: Our consumers should span a wide range of demographics to produce sufficiently diverse data, but for the sake of the exercise Danielle is a college student. Danielle is active on her smartphone, using a variety of apps that partner with Artificien for all her banking, music, insurance, and shopping needs. Danielle is not only an active cell phone user, she also incorporates a variety of wearables into her day-to-day routine for working out and communicating. She lives in a modern apartment with a built-in smart-home speaker system, an Amazon Alexa, and a bunch of different IoT-incorporated appliances. She is a data treasure trove.
  • Overheard quote: "Yeah, I just clicked 'agree' on the privacy policy without reading into it too much. I heard that apps are taking my data but it's not like I have anything to hide, so I don't really care. All I know is that I was talking to my roommate about lululemon yesterday and now I'm getting ads for their clothes on my computer - Zuckerberg at it again!"

Narrative

Danielle is the only user persona that gains very little actual utility from Artificien, and yet she is arguably the most vital to our success. Consumers are the source of all the data that gives Artificien value - Danielle, her device, and the rest of the sources of data in her life help build the integrated datasets that drive our clients' insight.

Danielle may be skeptical about her data being "sold" to Artificien by the apps and devices she uses. This is an ideological challenge that we will have to overcome as privacy concepts are pushed more into the everyday vernacular of consumers. That being said, one of Artificien's biggest value props is that we don't own data in the way that large tech companies have been scrutinized for. In fact, her privacy is our biggest concern, and our federated learning methods offer a new way to derive insight that will help further important healthcare, advertising, and other initiatives while keeping the intimate details of her life private.

A final advantage of Artificien, and a need of Danielle's to consider, is minimal inconvenience to her user experience. Artificien's API will allow developers to seamlessly integrate her data transfer, ensuring that her experience with the apps and devices she uses every day won't be even slightly compromised.

Behavioral and Dimensional Information

  • Goals and Motivations:
    • Danielle is using her apps and devices just as she does every day
    • Danielle is choosing apps and devices that most directly meet her needs in a smart, personalized way
    • Danielle is motivated by convenience and customization - her experiences in shopping, healthcare, and leisure should be closely tailored to her preferences.
  • Tasks:
    • Danielle needs to download and use the apps/devices with which we partner every day
  • Pain Points, Concerns, and Challenges:
    • Danielle is concerned in an abstract way with where her data goes
    • Danielle does not have the technical experience to understand federated learning
    • Danielle is skeptical of faceless companies and of how her data could be used not to improve her user experience, but to charge her more on certain products
    • Danielle has heard in the news that big tech is maliciously creating a detailed profile on her, and has removed a bunch of vital permissions for apps on her phone
  • User Flow
    • Danielle signs into her Amazon mobile app to do some Christmas shopping
    • Danielle browses the Amazon marketplace, adding three pairs of shoes - two Nike athletic, one dress shoe - to her Amazon wish list
    • Danielle takes a look at some protein powder and a pair of at-home dumbbells but does not add them to her list
    • Danielle signs off
    • Danielle's data is used to train a local clothing retailer's item suggestion algorithm for their app. This retailer is using Artificien to train on 10,000 Amazon users statewide. The next time Danielle decides to shop online locally, this predictive algorithm knows to add protein powder to the list of items she "might be interested in". Not only is her experience personalized, but her updated local model is sent back to Artificien HQ, where it is aggregated with 9,999 other updated predictive models to create the smartest algorithm ever for this local retailer.

Collect Apple Health data on iOS app

App will be written in Swift and allow some form of data collection, through either:

  • User input (e.g. credit score)
  • Device hardware (e.g. location)
  • App integrations (e.g. Apple Health data)

The app will also store the data locally on device, without sending it elsewhere.

Motivations:

  • As #4 Danielle the Data-Producing Consumer, I want to use my apps without worry so I can maintain my privacy

Andrew the Altruistic App Developer

Background and Demographic Information

  • Nickname: Andrew the Altruistic App Developer
  • Demographics: 20-something upper-middle-class male who was a CS major at a liberal arts school. While he builds apps for a living, he has been heavily influenced by his classmates' sentiment surrounding privacy, and strives to make applications that he feels are not intrusive in any way and are simply enjoyed by his users. He attempts to take only the bare minimum of data from his users.
  • Overheard quote: "This whole Federated Learning model really opened my eyes to how data access can be both ethical and democratized."

Narrative

While Andrew is a good guy, he also needs to keep the roof over his head and sustain his business. He has seen his peers making immense amounts of money from selling their user data and seen others achieve similar levels of success from putting advertisements within their applications. He doesn't love either of these ideas - the existing data selling solutions are intrusive and the ads dramatically lessen his users' experience. Artificien is attractive to him because of the "access to data" value prop over data itself. He knows he can go to sleep at night when partnering with our product.

Behavioral and Dimensional Information

  • Goals and Motivations:
    • Andrew is disenchanted with the current way app developers are making their money (advertisements, selling user data)
    • Andrew wants to sustain his business without invading his users' privacy or changing the user experience
    • Andrew essentially wants an ethical way to make money to keep his app afloat through a trusted partner
  • Tasks:
    • Understand how secure federated learning is
    • Start collecting more user data
    • Andrew needs to add an Artificien API
    • Will be paid anytime a business purchases access to his user data
  • Pain Points, Concerns, and Challenges:
    • Has trouble making money on their application in an ethical way; wants to keep application running
    • Doesn't feel secure building in random APIs without vetting the business
    • No background on federated learning
    • Number one priority is user experience/privacy
  • User Flow
    1. Andrew logs into the Artificien dashboard and registers his app with its unique ID
    2. Andrew browses and understands the range of data-gathering partners Artificien works with
    3. After registering, Andrew follows the Artificien onboarding tutorials and documentation to integrate the API into his app at key junctures when user data is collected.
    4. Andrew notifies his users during the data permissions step that their data may be used for data training purposes but will never leave their device.
    5. Andrew monitors app usage and crash data to ensure that Artificien's on-device processing does not burden his users. He finds nothing to suggest that it does.
    • Andrew, once he integrates with our API, doesn't heavily interact with our product in a day-to-day setting
    • The big lift is getting him to put in our API, and then everything is smooth sailing from there
    • He will receive payouts when a business on the other side of our marketplace wants to purchase access to his user data
    • Andrew will not have a page in the marketplace - instead of enterprises specifically selecting his users' data, they will outline the kind of data they want and Artificien will run FL on the right data that was outlined by the buyer

Marketplace UI/UX

Motivations:

  • As #3 Helen the Healthcare Researcher, I want to have an easy time navigating the Artificien platform so I can easily select data and upload a model without confusion.

This is just related to styling the Artificien client dashboard (better UI/UX through CSS and better tabulation) to make it a more delightful experience to browse data options and develop/upload a model.

Helen the Healthcare Researcher

Background and Demographic Information

  • Nickname: Helen the Healthcare Researcher
  • Demographics: 40 year old researcher, female. She is always looking for novel datasets and novel means of conducting research - but her peers always insist on doing it the 'old' way.
  • Overheard quote: "I've always said - the lab is not enough! We may have good control over our independent variables - but it's not real world data! Let's use the information generated out in the wild - real data - from people going about their real lives!"

Narrative

Helen was at first thrilled about going into research. She was young, coming out of a well-established PhD program, and was ready to take on the world. After a decade in healthcare research, she is now jaded and unimpressed with the slow pace of research and the archaic methods still employed today. She knows there is a vast trove of information being generated on mobile that could help her to identify the root causes of disease, and to understand human health and behavior better, but cannot find a good way to get this data! Every time she proposes mobile-enabled research, she receives push back regarding the privacy concerns, she's told 'it won't work' by her peers, and struggles to get users on board.

Behavioral and Dimensional Information

  • Goals and Motivations:
    • Utilize data collected on mobile to learn more about human health and behavior
    • Utilize health metrics in combination with medical diagnoses in order to perform root-cause analysis for diseases
    • Learn how to screen patients for disease or health problems using safe and secure screening methods that can be deployed on mobile
  • Tasks:
    1. Build an application, or partner with an app developer, that collects data relevant to health analytics, for example:
       • Accelerometer data, sleep patterns, and eating habits
       • Search history (searches can show patterns of sentiment, and people with mental health or other health problems often search online about the issues they are facing)
       • Social interaction information (texts, social media usage, calls, and Bluetooth proximity sensors showing how many other people a user is close to)
       • GPS information (how much time a person spends at work, how often and for how long they exercise, and, combined with Bluetooth proximity sensors, how often they socialize)
    2. Develop an SDK that can be utilized in iOS, Android, or Web apps, which allows health researchers like Helen to access the data collected in these apps in a federated way
    3. Partner with app developers collecting data relevant to Helen's research, or allow Helen to build her own app using our SDK.
  • Pain Points, Concerns, and Challenges:
    • Currently, mobile-enabled studies are extremely hard to conduct because:
      • Typically, researchers need to collect raw, personally identifiable information on users in the study, resulting in significant privacy concerns on the user end
      • These types of studies often require researchers to build a custom application to run their study
      • Getting users to download a custom app is difficult, and the types of users who will actually install such an app represent a highly biased subset of the population
    • Current experimental setups are good at controlling independent variables and providing statistical power, but they do not typically generate 'ecologically valid' data. For example, there are studies showing large differences in brain activity between looking at a picture of a face on a screen and looking at an actual live human face. Most studies rely on doctored analogs of real-life scenarios, and the results of these studies are not necessarily valid in the real world.
    • Standard experimental setups have difficulty recruiting participants, and often lack sufficiently large datasets to make substantiated and statistically significant claims. Mobile-enabled research could greatly expand the participant pool.
  • User Flow
    • Helen can't get the data she needs through traditional means
    • Helen comes to Artificien, and either asks for access to data that we've already collected, or asks us to put together a custom data collection system for her.
    • If asking for data that we already have:
      • Helen pays Artificien to gain access to the data
      • Helen writes analyses to be sent to the data-holding devices or builds ML models of some kind to model the data
      • Artificien distributes Helen's code to devices, where federated analytics/ learning occurs.
      • Artificien aggregates the results of the analysis such that no individual user/ device discloses personally identifiable information
      • Artificien returns the results of the analysis or training round to Helen
      • Helen uses these results to publish new research
    • If asking for new data
      • Helen writes an iOS/ Android/ Web app that will collect the data of interest
      • Helen utilizes the Artificien SDK to enable her app(s) to allow federated access
      • If Helen is willing to share the data collected by her study, she is given the option to share that data on the Artificien marketplace. Alternatively, if she wishes to monetize the data, we provide Helen a percentage of the revenues her app generates for Artificien.
      • Helen can also utilize Artificien's service to conduct federated analyses for herself.
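
The aggregation step in the first flow above (devices compute locally, Artificien returns only combined results) can be sketched as a federated histogram. Everything here is an assumption made for illustration: the sleep-hours metric, the function names, and the `min_devices` threshold. Suppressing sparse buckets is a simple heuristic rather than a formal privacy guarantee; techniques such as secure aggregation or differential privacy would be needed for stronger protection.

```python
from collections import Counter
from typing import Dict, Iterable, List


def local_histogram(sleep_hours: Iterable[float]) -> Counter:
    """Runs on one device: bucket nightly sleep into whole hours."""
    return Counter(int(hours) for hours in sleep_hours)


def aggregate(reports: List[Counter], min_devices: int = 3) -> Dict[int, int]:
    """Runs on the server: sum device histograms and hide sparse buckets."""
    totals: Counter = Counter()
    devices_per_bucket: Counter = Counter()
    for report in reports:
        totals.update(report)
        devices_per_bucket.update(report.keys())
    # Drop any bucket that fewer than min_devices devices contributed to
    return {
        bucket: count
        for bucket, count in sorted(totals.items())
        if devices_per_bucket[bucket] >= min_devices
    }


# Simulated nightly sleep durations (hours) held on four separate devices
device_data = [
    [7.5, 6.2, 7.1],
    [7.9, 8.4],
    [6.0, 7.0, 7.3],
    [4.5],
]
result = aggregate([local_histogram(nights) for nights in device_data])
```

In this sample only the 7-hour bucket survives, because it is the only one reported by at least three devices; the single 4.5-hour night, which could single out one user, never reaches the researcher.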

Provide a federated access point to data stored in-app

Motivations:

  • In order to access the data stored on-device in a mobile app, we need to build an API into the app.
  • Once the app has been built with a "designated access point" to the data stored within the app, we can conduct federated learning and federated analytics on the data stored on all devices which have installed the app.
  • We will utilize SwiftSyft (Syft ecosystem integration with iOS apps) in order to convert devices with our boilerplate app into "worker nodes" - worker nodes will act as the data providers in this case: "SwiftSyft makes it easy for you to train and inference PySyft models on iOS devices. This allows you to utilize training data located directly on the device itself, bypassing the need to send a user's data to a central server. This is known as federated learning." https://github.com/OpenMined/SwiftSyft
  • We will add any necessary features on top of SwiftSyft to expand its existing capabilities to meet our use case
