colinsongf
Type: User
Company: data
Location: cd
pandas, scikit-learn, xgboost and seaborn integration
A paper about the IPython Notebook written in 2013
Convert AI papers to GUIs, making it simple and convenient for everyone to use cutting-edge artificial intelligence technology.
a bot for paperweekly
Implementation of Paragraph Ranker for Open-Domain QA
Parallel Gradient Boosting Decision Trees
A parallel NLP Tokenization tool for transformers
Tutorial on scikit-learn and IPython for parallel machine learning
Parallel trainers using the python binding for MPI
Sparse Matrix Factorization (SMF) is a key component in many machine learning problems, and it has a variety of real-world applications such as recommendation systems, estimating missing values, gene expression modeling, and intelligent tutoring systems (ITSs). There are different approaches to SMF, rooted in linear algebra and probability theory. In this project, given an incomplete binary matrix of students' performances over a set of questions, the goal is to estimate the probability of success or failure on the unanswered questions. This problem is formulated using Maximum Likelihood Estimation (MLE), which leads to a biconvex optimization problem (this formulation is based on SPARFA [4]). The resulting optimization problem is hard to solve because it has many local minima. In addition, when the size of the matrix of students' performances increases, the existing algorithms are not successful; therefore, an efficient algorithm is required to solve this problem for large matrices. In this project, a parallel algorithm (i.e., a parallel version of SPARFA) is developed to solve the biconvex optimization problem and is tested on a number of generated matrices.
Keywords: parallel non-convex optimization, matrix factorization, sparse factor analysis
1 Introduction
Educational systems have witnessed a substantial transition from traditional educational methods, which mainly use textbooks, lectures, etc., to newly developed systems that are artificial-intelligence-based and personally tailored to the learners [4]. Personalized Learning Systems (PLSs) and Intelligent Tutoring Systems (ITSs) are two of the better-known instances of such recently developed educational systems. PLSs take into account learners' individual characteristics and then customize the learning experience to the learners' current situation and needs [2]. As computerized learning environments, ITSs model and track student learning states [1, 6, 7]. The Latent Factor Model and Bayesian Knowledge Tracing are the main classes of models in ITSs [3]. These new approaches encompass computational models from different disciplines, including cognitive and learning sciences, education, computational linguistics, artificial intelligence, operations research, and other fields. More details can be found in [1, 4–6].
Recently, [4] developed a new machine learning-based model for learning analytics, which approximates a student's knowledge of the concepts underlying a domain, and content analytics, which estimates the relationships among a collection of questions and those concepts. This model calculates the probability that a learner provides the correct response to a question in terms of three factors: their understanding of a set of underlying concepts, the concepts involved in each question, and each question's intrinsic difficulty [4]. They proposed a biconvex maximum-likelihood-based solution to the resulting SPARse Factor Analysis (SPARFA) problem. However, the scalability of SPARFA when the number of questions and students increases significantly has not been studied yet.
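The three-factor response model and the biconvex structure described above can be illustrated with a minimal sketch. It is not the repository's code: the logistic link, the names `W`, `C`, `mu`, `predict_proba`, `neg_log_likelihood`, and `alternating_step`, and the fixed step size are all assumptions made for illustration. The sparsity penalty, the update of the difficulty term, and the parallelization are left out.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict_proba(W, C, mu):
    """Probability of a correct response for every (question, student) pair.

    W  : (Q, K) non-negative question-to-concept loadings (assumed name)
    C  : (K, N) student concept knowledge (assumed name)
    mu : (Q,)   per-question intrinsic difficulty offset (assumed name)
    """
    return sigmoid(W @ C + mu[:, None])


def neg_log_likelihood(Y, mask, W, C, mu):
    """Bernoulli negative log-likelihood over the observed entries only."""
    P = np.clip(predict_proba(W, C, mu), 1e-12, 1.0 - 1e-12)
    ll = Y * np.log(P) + (1.0 - Y) * np.log(1.0 - P)
    return -np.sum(ll[mask])


def alternating_step(Y, mask, W, C, mu, lr=0.01):
    """One alternating gradient pass: update W with C fixed, then C with W fixed.

    Each sub-problem is convex, which is what makes the joint problem biconvex.
    mu is held fixed here to keep the sketch short.
    """
    # Gradient of the loss w.r.t. the linear predictor is (P - Y) on observed entries.
    R = np.where(mask, predict_proba(W, C, mu) - Y, 0.0)
    W = np.maximum(W - lr * (R @ C.T), 0.0)  # project onto W >= 0
    R = np.where(mask, predict_proba(W, C, mu) - Y, 0.0)
    C = C - lr * (W.T @ R)
    return W, C


# Small synthetic example: 20 questions, 30 students, 3 concepts, ~70% observed.
rng = np.random.default_rng(0)
Q, N, K = 20, 30, 3
Y = (rng.random((Q, N)) < 0.5).astype(float)
mask = rng.random((Q, N)) < 0.7
W, C, mu = rng.random((Q, K)), rng.standard_normal((K, N)), np.zeros(Q)
for _ in range(100):
    W, C = alternating_step(Y, mask, W, C, mu)
```

After the loop, `predict_proba(W, C, mu)[~mask]` gives the estimated success probabilities for the unanswered questions. Rows of `W` and columns of `C` can be updated independently when the other factor is fixed, which is one plausible route to a parallel version; whether the project parallelizes this way is not stated in the description.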
The latent bag of words model for paraphrase generation
Examine two sentences and determine whether they have the same meaning.
Datasets for the paper "Improving the Robustness of Question Answering Systems to Question Paraphrasing" (ACL 2019)
A framework for training and evaluating AI models on a variety of openly available dialog datasets.
A natural language semantic parser
Tutorial materials for parsetron documentation
CCIR code for the MRC task
Learning to represent shortest paths and other graph-based measures of node similarities with graph embeddings
PathNet model for Multi-hop Reading Comprehension (https://arxiv.org/pdf/1811.01127.pdf)
Describing statistical models in Python using symbolic formulas
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importance of modeling structure, context, and word order information for the problem of paraphrase identification.
Eigenvector decomposition and Singular Value Decomposition Implementation, applied to predicting movie ratings (Netflix problem)
A PDF processor written in Go.
The code of ACL 2020 paper "You Impress Me: Dialogue Generation via Mutual Persona Perception"
This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"