stefan-endres / dwpm-mixture-model
Phase separation calculation using the DWPM mixture rule.
In tgo.py:
k_c = numpy.floor((-(ep - 1) + numpy.sqrt((ep - 1.0)--2 + 80.0 * ep))
/ 2.0)
There is a --2 which I am pretty sure is not supposed to be there. Is it perhaps supposed to be **2 instead?
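For reference, Python parses a--2 as a - (-2), so the line as written is valid syntax and quietly computes a different value. Below is a minimal sketch comparing the two readings, assuming **2 was the intent. The function names are hypothetical, and math stands in for numpy here only so that the snippet has no dependencies (note that math.floor returns an int where numpy.floor returns a float).

```python
import math


def k_c_as_written(ep):
    # (ep - 1.0)--2 is parsed as (ep - 1.0) - (-2), i.e. (ep - 1.0) + 2
    return math.floor((-(ep - 1) + math.sqrt((ep - 1.0)--2 + 80.0 * ep))
                      / 2.0)


def k_c_squared(ep):
    # Assumed intent: square the (ep - 1.0) term
    return math.floor((-(ep - 1) + math.sqrt((ep - 1.0)**2 + 80.0 * ep))
                      / 2.0)
```

The two versions agree for some inputs and differ for others (for ep = 10 they give 9 and 10), so the typo would not necessarily show up in a casual check.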
The unit tests are failing in the build because the config.cfg file is not in the public GitHub repository. When a customized one is added, it cannot find the directory where the data is stored locally.
Is there any way to fix this? Maybe by adding remote data storage for the .csv files?
Tests should run properly.
There are many commented-out pieces of code in the repository. If they are not used anymore, delete them. You can always get them back from the git history. I'm also not sure that all the files in the repository are currently doing something important (see #12). It is important that only working, used code is in the master branch. If you have some other stuff you're playing with that's not yet ready for prime time, use a new branch. If you have some old stuff which was for testing or for a different purpose, delete it.
The beauty of having a version control system is that you don't need to keep the old versions of files around. If a file is no longer being used, just delete it. The same goes for code. Remember you can always look at the older versions or retrieve a file from an old version, so you can delete with impunity and have the current repository only contain the stuff that is needed right now.
You currently have two files in the repository with the same name but different casing (see above). You may not be able to see this on your local copy, but you can verify that it is the case by looking at the list of files on GitHub. If you have renamed nComp_test.py to ncomp_tests.py you should have done "git mv nComp_test.py ncomp_test.py". The key point is that git is case sensitive with filenames and that the current situation is causing issues with my mac which has a case-insensitive filesystem. Please do "git rm whichever_file_has_wrong_contents" and then if necessary, do "git mv correct_file_contents correct_file_name".
There are quite a few pieces of copy-paste cloning in the code base at the moment. Worse, some of them are not exactly the same and differ in small difficult-to-see ways.
Run clonedigger on the code and examine the output.html file it will create. If there is redundant code which has been superseded by new code, delete the old code. If both clones are still used, combine them into a common function/method.
I recommend that you develop working tests before doing this task.
There are a couple of places in the code where print is used to print warnings. The logging module is in the standard library and allows for more fine-grained control of user feedback. I recommend that all the parts where warnings are printed be replaced by logging.warning() (logging.warn() is a deprecated alias). Most user output which is not nicely formatted (just messages) can be handled this way.
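A minimal sketch of what the replacement could look like. The logger name "dwpm", the function check_fraction and the message text are hypothetical and only illustrate the pattern, not anything in the current code base.

```python
import logging

# Hypothetical logger name; a module would normally use __name__
logger = logging.getLogger("dwpm")


def check_fraction(x):
    """Return True if x is a valid mole fraction, warning via logging if not."""
    if not 0.0 <= x <= 1.0:
        # Replaces something like: print("Warning: mole fraction out of range")
        logger.warning("mole fraction %s is outside [0, 1]", x)
        return False
    return True
```

The calling script decides what happens to the messages, for example with logging.basicConfig(level=logging.WARNING) to show them or level=logging.ERROR to silence them, without touching the calculation code.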
The routines in numpy are probably faster and have better numerical behaviour than the routines in math. Find all the stuff you use in math and use the numpy version instead. As an addendum, also use exp(x) rather than e**x, as this can be more accurate and better behaved since e will only be known to finite precision, while an exponentiation function can calculate accurately without relying on a finite version of e.
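A small sketch for checking the exp(x) versus e**x point on your own machine. The helper name is hypothetical, and math.exp is used only to keep the snippet dependency-free; numpy.exp can be substituted directly. The size of any difference depends on the platform's math library, so no particular value is claimed here.

```python
import math


def exp_both_ways(x):
    """Return exp(x), e**x and their relative difference."""
    direct = math.exp(x)       # dedicated exponential routine
    via_power = math.e ** x    # power of the rounded constant e
    rel_diff = abs(direct - via_power) / direct
    return direct, via_power, rel_diff
```

Calling exp_both_ways(50.0) and printing the third element shows how far the two results drift apart for a given argument.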
Wouldn't a pandas dataframe be a better option than the csvDict class?
I've started working on more polar systems which are so non-ideal that the Gibbs surface is not defined for most parameters (which incidentally also destroys a few proofs I was working on due to lack of continuity). This ruins the Lagrange plane optimisation due to nan - nan behaving as a zero float when added to the objective function's summation (a scalar float output).
Essentially what I'm trying to do now is to catch any RuntimeWarning like:
ncomp.py:523: RuntimeWarning: invalid value encountered in log
- s.m['a'] / (p.m['R'] * s.m['T'] * V))
G_sol_I = [ 0.32783229]
G_sol_II = [ nan]
Which can be done with exception handling with a few simple modifications:
http://stackoverflow.com/questions/15933741/how-do-i-catch-a-numpy-warning-like-its-an-exception-not-just-for-testing
...and then add penalties to the objective function.
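A minimal sketch of the catch-and-penalise idea. It uses numpy.errstate, which turns the floating point warning into a FloatingPointError inside the with block; this is an alternative to the warnings-filter approach in the linked answer, not the same technique. The penalty value, the function names and the x * log(x) term are hypothetical stand-ins for the real Gibbs energy expressions.

```python
import numpy as np

PENALTY = 1e10  # hypothetical penalty value, to be tuned for the problem


def guarded_term(x):
    """Return x * log(x), or PENALTY where the logarithm is undefined."""
    try:
        # Raise FloatingPointError instead of emitting a RuntimeWarning
        with np.errstate(invalid="raise", divide="raise"):
            return float(x * np.log(x))
    except FloatingPointError:
        return PENALTY


def objective(fractions):
    """Scalar objective: sum of guarded terms, so no nan enters the sum."""
    return sum(guarded_term(x) for x in fractions)
```

Because the error state is set with a context manager, the stricter behaviour only applies inside guarded_term and the rest of the code keeps numpy's default warnings.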
The question now becomes how to deal with this in a high-level optimisation. The options as I see them are:
abs([invalid log value])). However, this will require a major rewrite of most functions wherein we either
Please let me know what you think.
This is a high priority request. For me to help more in this code base, it is important to have tests which exercise most of the code and automatically check correct outputs. I am having trouble with some of the refactoring I'd like to do because I'm not sure I will notice if I introduce an error.
Please create a README file which explains what the main entry points to the code base are. I notice there are files called pure.py, binary.py and nComp.py, which I assume handle progressively harder problems, but it is not clear how it all fits together. A couple of lines in a README would help anyone to navigate the code a bit better.
This appears to be a typo, but worse, there doesn't seem to be anything like this in the data structure. Don't add code which breaks to the version control without notes that it will break.
There are far too many try statements in the code as it stands now which encompass too much code. Especially egregious are the parts where you catch an exception only to raise one again (see for instance data_handling.py). In that case it is really bad because you print "Data not found" even if, for instance, the read failed for another reason.
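A minimal sketch of a narrower try block. The function load_rows and its default argument are hypothetical and not taken from data_handling.py; the point is only that the try covers the one call that can fail in the expected way and catches one specific exception.

```python
import csv


def load_rows(path, default=None):
    """Read a CSV file into a list of rows, or return default if it is missing."""
    try:
        # Only the open call is guarded, and only for a missing file
        handle = open(path, newline="")
    except FileNotFoundError:
        return default
    with handle:
        # Errors while reading or parsing are not caught here, so they
        # surface with their own message instead of "Data not found"
        return list(csv.reader(handle))
```

With this shape, a permissions problem or a malformed file produces its own traceback rather than being reported as missing data.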
There are some lines of testing code which are commented out. This is a bad pattern, as the way you would run this code now becomes "uncomment code, run, observe results, comment code again". It is far better to write this code as a separate file which can be run anytime. It is even better to have it written as a proper test. This pattern of hiding code in comments, to be run by uncommenting sections, is to be avoided in all cases as it causes churn in the version control system.
Go to settings and rename the project to "DWPM-Mixture-Model"
The current workflow is 1. edit the config file 2. run the script. This causes problems when, for instance, you want to run the script automatically for different sets of components.
I recommend the following idea:
This gets rid of many conditionals in your simulation program, making it a lot cleaner. Your main.py is a step in the same direction, but I think you will find splitting the plotting code from the calculation code with a data file in between makes a lot of the logic easier.
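A rough sketch of the split with a data file in between. Everything here is hypothetical: the function names, the JSON format and the placeholder calculation are assumptions for illustration and do not reflect how main.py currently works.

```python
import json


def run_calculation(components):
    """Placeholder for the real calculation; returns plain, serialisable data."""
    return {"components": list(components), "n_components": len(components)}


def save_results(results, path):
    """Calculation side: write results to a data file and stop there."""
    with open(path, "w") as f:
        json.dump(results, f)


def load_results(path):
    """Plotting side: read the data file without re-running the calculation."""
    with open(path) as f:
        return json.load(f)
```

The calculation script can then be looped over many component sets from the command line, and the plotting script only needs the resulting files.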
Use coverage to look at the coverage of your tests, which is not currently 100% in tgo.py, and write tests which at least cover all the code you have written.
I want to change some stuff on the base repo. Specifically I want to add automated testing checks via Travis.
I've registered the project at Code Climate. It would be useful for you to register a push hook there so that they will regenerate every time you push.
You appear to be using exec quite a lot, but it is really not to be used often. I would suggest searching for all the places you use exec and just rewriting those sections without exec.
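Two common uses of exec and what could replace them, as a sketch. I have not checked which patterns the code actually uses, so the names below (params, set_param, call_method) are hypothetical.

```python
# Instead of building variable names at runtime, e.g.
#     exec("%s = %r" % (name, value))
# keep the values in a dictionary keyed by name.
params = {}


def set_param(name, value):
    params[name] = value


# Instead of building a method call as a string, e.g.
#     exec("result = obj.%s()" % method_name)
# look the attribute up with getattr and call it.
def call_method(obj, method_name, *args):
    return getattr(obj, method_name)(*args)
```

Both replacements keep the names visible to tools like linters and debuggers, which exec strings hide.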
It is by and large not a good idea to store data in the repository. It's likely to change often, leading to messy commits. It can also get large, leading to large commits. If multiple people run the code, the data can also change differently leading to messy merges.
The pattern that I have found to work well is this:
.pyc files are compiled artifacts tied to the interpreter that generated them and should not be in the repo. You can add them to .gitignore so that they don't show up as dirty. In general, don't just add "all files" to the repo, be judicious.
Click here and enter alchemyst.