mkhorton's comments

Not for superconductivity specifically, but for a broad range of properties of crystals, this is what the Materials Project[0] does.

Materials Project is funded by the US Department of Energy and uses supercomputing to simulate hundreds of thousands of different crystal structures at the quantum mechanical level, to try to find those which have useful properties for practical applications.

This line of research is broadly called “materials discovery”, “materials design” (often “high throughput”) or even “materials genomics” depending on who you ask. These terms are provided in case anyone wants to search and read more about it.

[0] https://materialsproject.org


Thanks both for the appreciation, it's really nice to see! Will forward to the team :)


> Is there any way I could use this to see if there was merit in that idea?

It likely can't give you an instant answer, but it can be a good starting point for a research project. For example, Materials Project has information about the dielectric properties of materials, as well as datasets for electron conductivities, vibrational (phonon) properties and the like. So you would start by searching the dataset for the properties of interest to get a shortlist of candidate materials, and then do more focused studies based on those.

Note that the Materials Project database also includes known materials that are already used extensively in real-world devices, so it can provide additional information about those materials too. If you're looking for an improvement on an existing material, you can start with a known-good material and see whether similar materials exist that do better on your property of interest.
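To make the shortlist idea a bit more concrete, here's a rough sketch in plain Python. It doesn't call the Materials Project API at all; the `Candidate` fields, the identifiers and every number are made-up placeholders, and the band gap constraint is just an assumed example of a second property you might care about. The point is only the shape of the workflow: take a known-good baseline, filter on your constraints, and rank what's left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    material_id: str         # placeholder label, not a real Materials Project ID
    band_gap_ev: float       # placeholder value, for illustration only
    dielectric_const: float  # placeholder value, for illustration only


def shortlist(candidates, baseline, min_band_gap_ev):
    """Keep candidates that beat the baseline's dielectric constant while
    still meeting a minimum band gap, sorted best first."""
    keep = [
        c for c in candidates
        if c.band_gap_ev >= min_band_gap_ev
        and c.dielectric_const > baseline.dielectric_const
    ]
    return sorted(keep, key=lambda c: c.dielectric_const, reverse=True)


# Hypothetical baseline and candidate pool.
baseline = Candidate("known-good", band_gap_ev=3.0, dielectric_const=10.0)
pool = [
    Candidate("candidate-a", 3.5, 25.0),
    Candidate("candidate-b", 0.5, 80.0),  # rejected: band gap too small
    Candidate("candidate-c", 4.0, 8.0),   # rejected: worse than the baseline
    Candidate("candidate-d", 3.2, 12.0),
]

result = shortlist(pool, baseline, min_band_gap_ev=3.0)
print([c.material_id for c in result])  # ['candidate-a', 'candidate-d']
```

In practice the pool would come from a database query rather than a hard-coded list, and the shortlist is where the more focused studies would begin.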


> How do projects like this deal with papers published based on falsified data? Do they reproduce any of the source data themselves?

I can't speak to this specific instance, but Materials Project does try to pay close attention to questions of reproducibility and provenance. Materials Project runs open-source repos[0] so that its methods can be verified, makes individual calculations available via an API[1], and partners with NOMAD[2] to make larger files and calculation artifacts available for direct download. This is in addition to documenting methods via peer-reviewed papers, online docs, etc.

This is not to say that issues of reproducibility don't still exist, or that we ourselves couldn't be doing better. It's a big problem in the community.

[0] https://github.com/materialsproject

[1] https://api.materialsproject.org/docs

[2] https://www.nomad-coe.eu


I would agree with your comment, but I think it's fair to ask this question. Discovering new materials can have many unintended consequences, especially if they contain elements that are not earth abundant or have high costs (environmental, personal) associated with their extraction.


Yes, this is almost exclusively a computational resource, with the exception of experimental data contributed by third parties. Most of our compute comes from the lovely people at NERSC[0].

All our predictions are benchmarked against experimental data wherever possible, but it's always a balancing act between things that can be calculated reliably and at scale, and the latest-and-greatest methods which give the most accurate predictions.

[0] https://www.nersc.gov


We have a mechanism for uploading experimental data (MPContribs[0]), which can then be linked back to the Materials Project's "material detail pages" for a given material. This also provides a public API for bulk download of the data. We hope this will help make relevant experimental data more discoverable.

[0] https://contribs.materialsproject.org


There are a few differences, but broadly MatWeb is more useful for manufacturing and has a broader range of materials available (including plastics, extensive metallic alloys, etc.) and real world properties. These are materials you might purchase and use today.

In contrast, the Materials Project provides computed, predicted information on inorganic crystals (typically ideal, stoichiometric crystals) that might be used for many different device applications like solar, optoelectronics, batteries, etc. Many of these crystals will not be available to purchase and will need to be grown in a laboratory, so Materials Project is much more focused on active research into new materials.


Hi everyone, fun to see The Materials Project make the front page! I work on this, happy to answer any questions.


The page says the data is licensed under CC-BY (presumably in countries that have sui generis database protection, rather than countries like the US where facts aren't copyrightable). This is great!

Is there a torrent? How can we ensure that this treasury of materials knowledge is preserved 64, 256, or 1024 years into the future, even if, for example, the US goes to war against Russia or China and decides to criminalize exporting materials data?


In the short (~decade) term, we do tape backups of calculation data in Berkeley, and offload data to an independently-funded European project (NOMAD), to ensure data is in at least two locations. Likewise, our production databases are automatically backed up in the cloud, but we also keep a local mirror on a bare metal server. In the longer 2^6-year time frame or further out still, I would just be flattered if the data is at all still useful for people. I think it's fair to say our community has a lot of challenges to face before we get to that point.

We don't seed any torrents ourselves and only support API access (mainly because we're a small team and have to focus our effort), but with the open license I hope the data can live on wherever/however it can.
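On the "at least two locations" point: a second copy only helps if you can tell when the copies have drifted apart. As a rough sketch (this is not a description of our actual backup tooling, and the file names and contents are placeholders), anyone mirroring the data could keep a manifest of SHA-256 digests and compare it against the other copy:

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Hex SHA-256 digest of a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def manifest(files: dict) -> dict:
    """Map each file name to the digest of its contents."""
    return {name: sha256_digest(content) for name, content in files.items()}


def diff_copies(primary: dict, mirror: dict) -> list:
    """Names that are missing from the mirror or whose digests differ."""
    return sorted(
        name for name, digest in primary.items()
        if mirror.get(name) != digest
    )


# Placeholder file names and contents, standing in for real calculation data.
primary = manifest({"calc-1.json": b'{"e": 1}', "calc-2.json": b'{"e": 2}'})
mirror = manifest({"calc-1.json": b'{"e": 1}', "calc-2.json": b'{"e": 2 }'})

print(diff_copies(primary, mirror))  # ['calc-2.json']
```

Here the dictionaries hold bytes in memory to keep the example self-contained; a real mirror would hash files on disk and store the manifest alongside them.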


If someone were to try to do a bulk download of the data (well, or whatever they thought was the most significant data) through the API for preservation purposes, might it put an undue load on your server infrastructure? Some kind of bulk data download might be useful insurance there.

There seem to be some interesting efforts to run SQLite in the browser so that server infrastructure only has to provide bulk data access, with precomputed indices to avoid full table scans; I wonder if those might be applicable here: https://blog.ouseful.info/2022/02/11/sql-databases-in-the-br... (though of course if you aren't using SQLite as your backend now it might be a headache)

Such an approach, if it were feasible, would have the advantage that bulk data downloads wouldn't look very different from normal use.


This would be a much bigger conversation, but the SQLite efforts are very cool.

Short answer to your question is that the API load should be fine (I regularly download large subsets of the database myself via the API for research purposes), although there are good and bad ways of writing API queries. We have some tutorials, workshops, etc. available to help newcomers to our API write good queries.

We also have an email address set up (heavy.api.use@materialsproject.org) where people can give us a heads up if they are concerned about putting an undue load on our servers; as much as we try to have reasonable automatic limits set, sometimes we have had issues! API traffic continues to grow too, which in some ways is a nice problem to have, but does mean this is a moving target.
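I haven't spelled out above what makes a query "good" or "bad", so take this as a general pattern rather than official guidance: ask for many materials per request instead of one request per material, and ask only for the fields you need. The sketch below just builds the request descriptions and sends nothing; the `ids` and `fields` parameter names and the identifiers are hypothetical placeholders, not the real API schema.

```python
from typing import Iterator, List, Sequence


def batched(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` identifiers."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_requests(ids: Sequence[str], fields: Sequence[str], batch_size: int) -> List[dict]:
    """One request description per batch, asking only for the listed fields.

    The dictionary keys are hypothetical query parameter names.
    """
    return [
        {"ids": ",".join(batch), "fields": ",".join(fields)}
        for batch in batched(ids, batch_size)
    ]


ids = [f"id-{n}" for n in range(1, 8)]  # seven placeholder identifiers
requests_to_send = build_requests(ids, ["formula", "energy"], batch_size=3)

print(len(requests_to_send))       # 3 requests instead of 7
print(requests_to_send[0]["ids"])  # id-1,id-2,id-3
```

The right batch size depends on the server's limits, which is exactly the kind of thing worth checking in the tutorials or asking us about before a large download.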


That's good to hear!


Is this aimed at inorganic materials in general, or are there areas of specialization like say, industrial catalysts for fluid-bed processes etc?


It is aimed at inorganic materials in general, and many of the calculations are bootstrapped from existing experimental crystal databases.

However, this is not to say there aren't some biases. A lot of the Materials Project collaborators work on battery research, so there is some bias towards battery materials. But people have used MP to search for new photocatalysts, for example (or carbon capture materials, new phosphors, thermoelectrics for solid-state refrigeration, lead-free piezoelectrics, transparent conductors, etc.; the list goes on).


Is there any way to request adding a new theoretical material? There are a couple of scandium based compounds that could theoretically exist that I think would be interesting to reason about.


Absolutely, yes. Materials Project runs a service called "MPComplete" where people can submit structures to "help complete the database." There's an API, or we're working on a new drag-and-drop interface on the website to quickly upload a CIF or similar.

By all means email me at mkhorton@lbl.gov if you're interested and I can sort it out.


Hi there! I'm hoping to learn more about Materials Project. Is your team aware that on the website, the documentation links are not working?


No, I was not aware, thanks for reporting! Have we missed a link somewhere? Docs link is https://docs.materialsproject.org and is online.


lol hi matt! Interesting finding Berkeley theory folks out in the wild


hi Alex! :)


Materials Project, Lawrence Berkeley National Laboratory | Web Developer | Berkeley, CA, USA | Onsite | https://materialsproject.org https://lbl.gov

Mission: We are a group of academic researchers who create and curate the Materials Project, the world's leading database of crystalline materials, which is freely available for people to query to find materials for applications such as energy, batteries, solar, water splitting, optoelectronics and more. Our user base is growing exponentially (now >120k) and includes a wide range of people, from students who are just encountering materials science for the first time, to academic researchers and industry users. We’re now in the process of building a new frontend for the website to meet some key needs that have arisen as the project has grown, as well as to share some of the latest data we’ve been generating, which will require deep thought about how best to make this data accessible and understandable to the broadest possible audience. If this sounds exciting to you, please get in touch. The Materials Project was founded in 2011.

Technologies: This is a good time to start working with us since we're at the early stages of designing our new frontend, and you will have an opportunity to help us shape what that looks like. We've settled on React and TypeScript for our core technologies, and are committed to modern best practices where possible. Due to the large number of Python developers in our team, we will also be making heavy use of the Plotly Dash framework, and extending this using custom React components, so some Python familiarity will also be useful. All the code we write is open source <3 you can find our code at https://github.com/materialsproject

Team: You will be joining a small team of four core developers, along with a larger research group of many postdocs and graduate students here at LBL, and also interacting with our collaborators worldwide.

COVID statement: This is an on-site job; however, we are currently working remotely and have been given guidance to expect this to continue until the end of September.

The official job ad, further details on how to apply, and our equal employment opportunity statement are all available here: https://lbl.referrals.selectminds.com/jobs/web-developer-the...

Please note that this ad is a re-post from June, and we are currently interviewing candidates. However, if the job ad link is still active, that means we are still accepting applications. We look forward to hearing from you!

