Intersections of chemistry and statistics

I am asking this question for a friend who knows a lot of chemistry and is now studying statistics, primarily because he has heard that this is the age of data and that one should know statistics. However, he would like to know whether there are works on the confluence of statistics and chemistry, of which he could not find many articles online.



When I say "works", I mean works similar in spirit to the intersection of differential geometry and statistics (which emerged as a way to model the human brain), or of pattern recognition and algebraic geometry, etc. I do not mean the chemical statistics of laboratories, like mean, median, mode of results obtained from experimental data. I mean serious statistics and chemistry.



If you are interested in his background, he knows mathematical statistics and probability, such as notions of convergence in probability, measure theory, analysis, theoretical statistics, etc.



It would be very helpful if you can suggest some references.










      machine-learning inference chemometrics chemistry






      edited Sep 10 at 5:05
      community wiki
      Landon Carter





















          2 Answers
          I'm an analytical chemist specialised in spectroscopy and chemometrics.



          Chemometrics is statistics for chemical questions/tasks/problems (similar to psychometrics, biometrics, etc.) and definitely a search term your friend should check out.




          "works on the confluence of statistics and chemistry, of which he could not find many articles online."




          This may be because that intersection is a rather specialized field, and thus much smaller than, say, organic chemistry.



          OTOH, it may also be due to not knowing the terminology to search for. Finding such search terms can be very difficult, particularly in these interdisciplinary and applied stats fields: many of them have their own terminology, as people from various disciplines may describe situations which are similar in their statistical aspects with very different terms. E.g. I'm currently looking into aspects of nested designs/data structures, which I describe to chemists as hierarchical, to database people as having 1 : n relationships, and which in the social sciences are known as clustered.



          A third possible reason is that, at least over here in Germany, I'd estimate that far more such activities are going on in industry than in academia, which may give publication a lower priority than implementation.




          • As @Ingolifs already pointed out, there's a whole lot of statistics in chemical process optimization, and that's related both to (chemical) analysis (PAT: process analytical technology) and synthesis.


          • The "Applying big data to chemistry and drug design" idea would be related to QSAR (quantitative structure-activity relationship).



          "I do not mean the chemical statistics of laboratories, like mean, median, mode of results obtained from experimental data. I mean serious statistics and chemistry."




          I suspect that you do have a misconception here. In my experience, many of the simple analytical textbook situations have a habit of needing rather more serious statistics as soon as a little bit of reality happens. The textbook may be fine with a bit of mean and median and a hyperbolic confidence interval for a linear calibration function. Real-life analytical data comes with more complex structures of influencing factors and confounders, as well as experimental constraints that mess up any simple design for the experiment. E.g. I find myself today first using a mixed model to analyze the data I want to use as reference for calibration of some other measurements later on.
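
To make the textbook case concrete, here is a minimal sketch of the confidence band around a straight-line calibration, in plain Python. The concentrations, signals and the function name calibration_band are made up for illustration. The half-width t * s * sqrt(1/n + (x0 - mean(x))^2 / Sxx) is the standard result for simple linear regression, and its dependence on x0 is what gives the band its hyperbolic shape. The value 2.776 is the two-sided 95 % Student t quantile for n - 2 = 4 degrees of freedom.

```python
import math


def calibration_band(x, y, x0, t_crit):
    """Return (fitted value, confidence half-width) of the calibration line at x0."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual standard deviation with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    # The (x0 - x_mean)^2 term makes the band widen away from the mean of x.
    half_width = t_crit * s * math.sqrt(1 / n + (x0 - x_mean) ** 2 / sxx)
    return intercept + slope * x0, half_width


# Hypothetical calibration standards (concentration) and instrument signals.
conc = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
signal = [0.02, 0.41, 0.79, 1.22, 1.58, 2.03]

for x0 in (0.0, 5.0, 10.0):
    fitted, half_width = calibration_band(conc, signal, x0, t_crit=2.776)
    print(f"x0={x0:5.1f}  fitted={fitted:.3f}  +/- {half_width:.3f}")
```

The band is narrowest at the mean of the calibration concentrations and widens towards both ends. The sketch assumes independent, homoscedastic errors and a single level of variation, which is exactly what stops holding once the data are nested or confounded.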



          So, applied does not equal easy (or "not serious") - if you take applied data seriously, you'll find that many assumptions that make life easier in "more serious" theory do not hold in many application situations.



          I'd say that there's a huge amount of serious work still to be done in this area. At least I can say that astonishingly often the answer I get to my "this is the situation, we cannot use the easy case here, because of .... How can we approach this?" is: good question - please tell me once you know - there's no known solution so far.

          A colleague from stats/data analysis once complained about "us chemists" that we either have tasks that are no fun because the solution is obvious - or that are so hard that they are impossible to solve ;-) (which I guess is pretty much the same for all those applied stats fields).






            A few people will think this question is rather broad but I think it is answerable.



            It does however require you to get a bit more info from your friend about what he's after because statistics intersects chemistry in a multitude of places.



            If it's physical chemistry type stuff he's after, I'd start with Statistical Thermodynamics (wikipedia). Also G.S. Yablonsky is a good name to search for.



            If it's more stuff to do with experiment design and "There must be a way of squeezing more knowledge out of this data" type stuff, then there is a lot out there.



            Design of Experiments, Optimal design and Bayesian experimental design are all topics that should get your friend started on the question of how to apply statistics to complicated chemistry optimisation problems.
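
As a small illustration of the simplest of these ideas, a two-level full factorial design enumerates every combination of factor levels. The sketch below is plain Python; the function name full_factorial and the factor names and levels are hypothetical and only stand in for whatever a real synthesis would vary.

```python
from itertools import product


def full_factorial(levels):
    """Return one dict per run, covering every combination of factor levels."""
    names = list(levels)
    return [
        dict(zip(names, combo))
        for combo in product(*(levels[name] for name in names))
    ]


# Hypothetical factors for a reaction optimisation, two levels each.
factors = {
    "temperature_C": [60, 80],
    "catalyst_mol_percent": [1, 5],
    "solvent": ["toluene", "THF"],
}

runs = full_factorial(factors)
for run_number, run in enumerate(runs, start=1):
    print(run_number, run)
```

With three factors at two levels this gives 2^3 = 8 runs. Optimal and Bayesian designs go further by choosing which runs to perform according to a model, which this sketch does not attempt.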



            There is a caveat. "Applying big data to chemistry and drug design" is an idea that seems to come up independently every few months, and it's one that hasn't met with much success. Usually, this is down to the quality of the chemical data itself - I certainly wouldn't want someone using the data from my PhD synthesis work without accounting for all the unmentioned variables, like the brand and quality of the reagents used, humidity on that day, phase of the moon, etc. As the phrase goes, "Garbage in, garbage out".



            If there is a more specific or different question you had in mind, don't hesitate to mention it, and I will attempt an answer.







            answered Sep 10 at 5:28
            community wiki
            Ingolifs








            • Thank you for such a nice answer. He said he found organic chemistry fascinating. I understand, however, that it is a bit difficult to find an intersection between organic chemistry and statistics. But when you say chemistry and statistics intersect in a large number of places, would it be possible to give some more examples or point to articles or people? Thanks for all the help.
              – Landon Carter
              Sep 10 at 8:43

            • Statistical thermodynamics often "approximates away" statistics in one of the first few steps because sufficient atoms/molecules are available so all that is needed is the mode of the distribution in question. Solve for MLE, and from then on statistics are basically gone from the calculation. (This thought was triggered by a discussion with a physical chemistry prof saying he doesn't get the statistics we chemometricians do - and a colleague answering "But you do lecture statistical thermodynamics!?" and the prof then saying that that is completely different from what we do.)
              – cbeleites
              Sep 10 at 10:34

            • Yes, while many physical chemistry problems have been tackled with statistics (and maths in general), students mostly get shown the derivation only once before it's all packaged away as ready-to-use formulas. I wouldn't wholly blame the chemists for this though - the problems encountered in chemistry seem to be prone to going from trivial to analytically intractable just from a simple addition or change in assumptions. @Landon Carter I am currently between jobs so I will see if I can find more examples for you.
              – Ingolifs
              Sep 10 at 18:44













            • 1




              Thank you for such a nice answer. He said he found organic chemistry fascinating. I understand however that it is a bit difficult to find an intersection between organic chemistry and statistics. But when you say chemistry and statistics intersect in a large number of places, will it be possible to give some more examples or direct to articles or people? Thanks for all the help.
              – Landon Carter
              Sep 10 at 8:43










            • Statistical thermodynamics often "approximates away" statistics in one of the first few steps because sufficient atoms/molecules are available so all that is needed is the mode of the distribution in question. Solve for MLE, and from then on statistics are basically gone from the calculation. (This thought was triggered by a discussion with a physical chemistriy prof saying he doesn't get the statistics we chemometricians do - and a colleague answering "But you do lecture statistical thermodynamics!?" and the prof then saying that that is completely different from what we do)
              – cbeleites
              Sep 10 at 10:34










            • Yes, while many physical chemistry problems have been tackled with statistics (and maths in general), students mostly get shown the derivation only once before it's all packaged away as ready-to-use formulas. I wouldn't wholly blame the chemists for this though - the problems encountered in chemistry seem to be prone to going from trivial to analytically intractable just from a simple addition or change in assumptions. @Landon Carter I am currently between jobs so I will see if I can find more examples for you.
              – Ingolifs
              Sep 10 at 18:44








            1




            1




            Thank you for such a nice answer. He said he found organic chemistry fascinating. I understand however that it is a bit difficult to find an intersection between organic chemistry and statistics. But when you say chemistry and statistics intersect in a large number of places, will it be possible to give some more examples or direct to articles or people? Thanks for all the help.
            – Landon Carter
            Sep 10 at 8:43




            Thank you for such a nice answer. He said he found organic chemistry fascinating. I understand however that it is a bit difficult to find an intersection between organic chemistry and statistics. But when you say chemistry and statistics intersect in a large number of places, will it be possible to give some more examples or direct to articles or people? Thanks for all the help.
            – Landon Carter
            Sep 10 at 8:43












            Statistical thermodynamics often "approximates away" statistics in one of the first few steps because sufficient atoms/molecules are available so all that is needed is the mode of the distribution in question. Solve for MLE, and from then on statistics are basically gone from the calculation. (This thought was triggered by a discussion with a physical chemistriy prof saying he doesn't get the statistics we chemometricians do - and a colleague answering "But you do lecture statistical thermodynamics!?" and the prof then saying that that is completely different from what we do)
            – cbeleites
            Sep 10 at 10:34




            Statistical thermodynamics often "approximates away" statistics in one of the first few steps because sufficient atoms/molecules are available so all that is needed is the mode of the distribution in question. Solve for MLE, and from then on statistics are basically gone from the calculation. (This thought was triggered by a discussion with a physical chemistriy prof saying he doesn't get the statistics we chemometricians do - and a colleague answering "But you do lecture statistical thermodynamics!?" and the prof then saying that that is completely different from what we do)
            – cbeleites
            Sep 10 at 10:34












            Yes, while many physical chemistry problems have been tackled with statistics (and maths in general), students mostly get shown the derivation only once before it's all packaged away as ready-to-use formulas. I wouldn't wholly blame the chemists for this though - the problems encountered in chemistry seem to be prone to going from trivial to analytically intractable just from a simple addition or change in assumptions. @Landon Carter I am currently between jobs so I will see if I can find more examples for you.
            – Ingolifs
            Sep 10 at 18:44





            Yes, while many physical chemistry problems have been tackled with statistics (and maths in general), students mostly get shown the derivation only once before it's all packaged away as ready-to-use formulas. I wouldn't wholly blame the chemists for this though - the problems encountered in chemistry seem to be prone to going from trivial to analytically intractable just from a simple addition or change in assumptions. @Landon Carter I am currently between jobs so I will see if I can find more examples for you.
            – Ingolifs
            Sep 10 at 18:44













            up vote
            0
            down vote













            I'm analytical chemist specialised in spectroscopy and chemometrics.



            Chemometrics is statistics for chemical questions/tasks/problems (similar to psychometrics, biometrics, etc.) and definitively a search term your friend should check out.




            works on the confluence of statistics and chemistry, of which he could not find many articles online.




            This may be because that intersection is a rather specialized field, and thus much smaller than, say, organic chemistry.



            OTOH, it may also be due to not knowing the terminology to search for. Finding such search terms can be very difficult, particularly in these interdisciplinary and applied stats fields: many of them have their own terminology, and people from different disciplines may describe situations that are similar in their statistical aspects in very different terms. E.g. I'm currently looking into aspects of nested designs/data structures, which I describe to chemists as hierarchical, to database people as having 1 : n relationships, and which are known in the social sciences as clustered.



            A third possible reason is that, at least over here in Germany, I'd estimate that far more such activities are going on in industry than in academia, which may lead to publication getting lower priority than implementation.




            • As @Ingolifs already pointed out, there's a whole lot of statistics in chemical process optimization, and that's related both to (chemical) analysis (PAT: process analytical technology) and synthesis.


            • The "Applying big data to chemistry and drug design" would be related to QSAR (quantitative structure-activity relationship).



            "I do not mean the chemical statistics of laboratories, like mean, median, mode of results obtained from experimental data. I mean serious statistics and chemistry."




            I suspect that you do have a misconception here. In my experience, many of the simple analytical textbook situations have a habit of needing rather more serious statistics as soon as a little bit of reality happens. The textbook may be fine with a bit of mean and median and a hyperbolic confidence interval for a linear calibration function. Real-life analytical data is a different matter: it comes with more complex structures of influencing factors and confounders, as well as experimental constraints that mess up any simple design for the experiment. E.g. I find myself today first using a mixed model to analyze the data I want to use as reference for calibration of some other measurements later on.
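To make the "textbook" case concrete, here is a minimal sketch of the hyperbolic confidence band around a straight-line calibration. The data are made up for illustration, the function names are my own, and the t quantile is simply taken from a table for n - 2 = 4 degrees of freedom; real calibration data will rarely be this well behaved.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, s, x_mean, sxx)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Residual standard deviation with n - 2 degrees of freedom.
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    return a, b, s, x_mean, sxx

def band_half_width(x0, n, s, x_mean, sxx, t_crit):
    """Half-width of the confidence interval for the mean response at x0."""
    return t_crit * s * math.sqrt(1.0 / n + (x0 - x_mean) ** 2 / sxx)

# Hypothetical calibration data: concentration vs. instrument signal.
conc = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
signal = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]

a, b, s, x_mean, sxx = fit_line(conc, signal)
t_crit = 2.776  # two-sided 95 % t quantile for 4 degrees of freedom (table value)

for x0 in (0.0, 2.5, 5.0):
    hw = band_half_width(x0, len(conc), s, x_mean, sxx, t_crit)
    print(f"x0={x0:.1f}  fit={a + b * x0:.3f}  +/- {hw:.3f}")
```

The band is narrowest at the mean concentration and widens towards both ends of the calibration range, which is where the hyperbolic shape comes from. It assumes independent, homoscedastic errors and a single level of variation, and those are exactly the assumptions that nested or confounded real-life data tend to break.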



            So, applied does not equal easy (or "not serious") - if you take applied data seriously, you'll find that many assumptions that make life easier in "more serious" theory do not hold in many application situations.



            I'd say that there's a huge amount of serious work still to be done in this area. At least I can say that astonishingly often the answer I get for my "this is the situation, we cannot use the easy case here, because of .... How can we approach this?" is: good question - please tell me once you know - there's no known solution so far.

            A colleague from stats/data analysis once complained about "us chemists" that we either have tasks that are no fun because the solution is obvious - or that are so hard that they are impossible to solve ;-) (which I guess is pretty much the same for all those applied stats fields)














                answered Sep 11 at 0:27


























                community wiki





                cbeleites




























                     
