Minimum number of points for a good exponential curve fit

We are fitting data from a process whose output decays exponentially. We use several techniques to fit this decay, including FFT analysis and a weighted least-squares algorithm, and we need to fit many of these decays every second (in some cases 8000 decays/s) on a variety of computing platforms.

We want to minimize computation time, so our question is: how do we determine the minimum number of points required for a "good" fit (I know the term is ambiguous)? To be clear, we aren't asking how long the sampled record must be; we want to know how many points are needed over a given portion of the curve. Is there an explicit approach to determining this, or will it require modeling?

– cirrusio, asked Jul 28 '17 at 17:20

  • It probably depends on how complicated the function you are trying to fit is and how noisy the data is (e.g. how safe it would be to take a random subsample). What is the model you are fitting? And how many points do you usually get per decay?
    – user3658307
    Jul 28 '17 at 18:43
















4 Answers
I'm not sure I understand what you're doing, but perhaps some elementary
relationships will be useful. If none of the answers here are sufficient,
then you might consider posting this question on our sister site.



Let $X_1, X_2, \dots, X_n$ be a random sample from
$\mathsf{Exp}(\text{mean}=\mu) \equiv \mathsf{Exp}(\text{rate}=1/\mu).$
Then an unbiased estimator of $\mu$ is the sample mean $A = \bar X.$
An empirical cumulative distribution function (ECDF) estimates the
CDF of the population distribution.



Because $A/\mu \sim \mathsf{Gamma}(\text{shape} = n, \text{rate} = n),$ we can find
constants $L$ and $U$ that cut off 2.5% of the probability from the
lower and upper tails, respectively, of this distribution so that
$$P(L < A/\mu < U) = P(A/U < \mu < A/L) = .95$$
and a 95% confidence interval for $\mu$ is $(A/U, A/L).$
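
As a quick consequence of the Gamma distribution above (a standard fact, added here only as a sketch), the relative spread of the estimator depends on nothing but $n$:

```latex
\mathbb{E}\!\left[\frac{A}{\mu}\right] = \frac{n}{n} = 1, \qquad
\operatorname{Var}\!\left[\frac{A}{\mu}\right] = \frac{n}{n^{2}} = \frac{1}{n}, \qquad
\frac{\operatorname{sd}(A)}{\mu} = \frac{1}{\sqrt{n}}.
```

So a sample of size 10 corresponds to a relative standard deviation of about 32%, and a sample of size 100 to about 10%.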



Below, I simulate a sample of size $n = 10$ from $\mathsf{Exp}(\mu = 3).$
The figure shows the ECDF of the sample, the CDF for
$\mathsf{Exp}(A)$ (dotted red curve) and the CDF for the population
distribution $\mathsf{Exp}(3).$ Some simulations showed better fit
and others showed worse fit. For the simulation shown, $A = 3.48$ and the CI for $\mu$
is $(2.04, 7.25).$



[Figure: ECDF of the sample ($n = 10$) with the estimated (red) and exact (blue) CDFs]



Here is elementary R code (in R, the second argument of rexp is the
rate parameter):



# Simulate n exponential observations with mean mu (rexp takes the rate 1/mu)
n <- 10; mu <- 3
x <- rexp(n, 1/mu); a <- mean(x)
plot(ecdf(x), ylab="CDF", main="Empirical CDF with Estimated (red) and Exact CDFs")
curve(pexp(x, 1/a), lwd=3, lty="dashed", col="red", n=1001, add=TRUE)  # estimated CDF
curve(pexp(x, 1/mu), col="blue", n=1001, add=TRUE)                     # exact CDF
a; a/qgamma(c(.975,.025), n, n)  # point estimate and 95% CI for mu
## 3.479 # est of pop mean
## 2.036312 7.254886 # CI for pop mean


For a sample of $n = 100,$ one simulation gave $A = 3.31$ and CI $(2.74, 4.06).$
[I had to do several simulations to get one that resulted in visible separation among
the ECDF and the two CDFs.]



[Figure: the same plot for a sample of size $n = 100$]



I'm wondering whether such elementary methods of approximating the actual
CDF by the ECDF or the estimated CDF would be of any use in your project.






As stated in the original post, the function is a single exponential.

Also, as stated in the original post, we are trying to DETERMINE the number of points needed. As a rough guideline, we currently sample hundreds to thousands of points for each curve. But maybe we only need 10 or 50 or 100.






    • This is not an answer. Are you the original poster? If so, the FAQ will show you how to merge the accounts, or you can edit the original post using the edit button below.
      – Ross Millikan
      Jul 28 '17 at 19:02










    • @RossMillikan - that was not the original poster but a colleague. This should have been a comment for clarification not an answer.
      – cirrusio
      Aug 1 '17 at 15:41











If the data is clean, you only need two points, because there are only two degrees of freedom: the initial rate and the decay time. You use more points when the data is noisy to get a better estimate, so the number of points needed depends on how noisy the data is. You can linearize the fit by taking the logarithm, getting $\log(\text{counts}(t))=\log(\text{counts at } t=0)-(\text{decay rate}\times\text{time})$. A linear least-squares fit will then give you estimates of the errors in the parameters. I would take a few curves and fit each of them with lots of points. You can then fit a random sample of the points and see how much the fit changes.
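
The experiment described in this answer can be sketched in a few lines of Python. This is an illustrative sketch, not the answerer's own code: the names `fit_log_linear` and `simulate_decay`, the simulated decay rate, the time window, and the 5% multiplicative noise level are all assumptions made for the example. It fits the log-transformed data by ordinary least squares, reports the standard error of the fitted rate, and repeats the fit on random subsamples.

```python
import math
import random


def fit_log_linear(t, y):
    """Ordinary least squares on (t, log y).

    Returns (decay_rate, rate_std_error). Needs at least 3 points and y > 0.
    """
    n = len(t)
    z = [math.log(v) for v in y]
    t_mean = sum(t) / n
    z_mean = sum(z) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_tz = sum((ti - t_mean) * (zi - z_mean) for ti, zi in zip(t, z))
    slope = s_tz / s_tt
    intercept = z_mean - slope * t_mean
    # Residual sum of squares gives the usual OLS error estimate for the slope.
    rss = sum((zi - intercept - slope * ti) ** 2 for ti, zi in zip(t, z))
    slope_se = math.sqrt(rss / (n - 2) / s_tt)
    return -slope, slope_se


def simulate_decay(n, rate, window, noise, rng):
    """n equally spaced samples of exp(-rate*t) on [0, window], multiplicative noise."""
    t = [window * i / (n - 1) for i in range(n)]
    y = [math.exp(-rate * ti + rng.gauss(0.0, noise)) for ti in t]
    return t, y


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed example values: true rate 2.0, window 3.0, 5% multiplicative noise.
    t, y = simulate_decay(n=1000, rate=2.0, window=3.0, noise=0.05, rng=rng)
    for k in (1000, 300, 100, 30, 10):
        idx = sorted(rng.sample(range(len(t)), k))  # random subsample of k points
        rate_hat, se = fit_log_linear([t[i] for i in idx], [y[i] for i in idx])
        print(f"{k:5d} points: rate = {rate_hat:.4f} +/- {se:.4f}")
```

Running the same loop on measured curves instead of simulated ones shows how quickly the estimate and its error bar degrade as points are dropped.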







– Ross Millikan, answered Jul 28 '17 at 19:12

    • Thanks @RossMillikan. This is data from a real-world process, so it is definitely noisy. The issue is that we have computational limits, so we are trying to balance goodness of fit against our available computational power. Your answer actually restates the problem; what we are looking for is an answer to this question: is there a mathematical way of determining how our goodness of fit might change with the resolution of the points to fit? When do my returns start to diminish with increasing effort? Hopefully that is clearer.
      – cirrusio
      Aug 1 '17 at 15:48

















This is awfully close to what I'm working on right now! :)



The most important aspect is what the type of noise is. Is it mostly additive (i.e. your signal is $e^{a-bx_i} + N_i$) or is it multiplicative (i.e. $e^{a-bx_i+N_i}$)? I suppose that in practice you probably have a bit of both.



If it's multiplicative, then all your data is positive, and you can easily get a solid fit by taking the logarithm of your data and doing a linear fit. That's very robust and, of course, very fast!



Additive noise means that later samples will have a much larger relative error than the earlier samples, so they will be less useful to the fit.
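
To make the "larger relative error" statement concrete, here is a first-order (delta-method) sketch. It assumes the additive noise $N_i$ has zero mean and a constant standard deviation $\sigma$ that is small compared with the signal; neither assumption is stated in the question.

```latex
y_i = e^{a - b x_i} + N_i, \qquad
\log y_i \approx (a - b x_i) + \frac{N_i}{e^{a - b x_i}}, \qquad
\operatorname{sd}(\log y_i) \approx \sigma\, e^{-(a - b x_i)} = \sigma\, e^{-a}\, e^{b x_i}.
```

Under these assumptions the scatter of the log-transformed samples grows like $e^{b x_i}$, so a weighted log-linear fit would use weights roughly proportional to $y_i^2$, which down-weights the late samples.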



In the likely scenario where the additive noise is very small compared with the signal over the early samples (a quick test: would it be totally ridiculous for one of the first few samples to drop all the way to zero?), just take the first batch of samples and fit it with the logarithm-plus-linear-regression method. That method is very fast, so I doubt computation time would be an issue for anything but the most extreme sample rates (> 1 million readings per second). How many samples should you take? That depends on the magnitude of the multiplicative noise and on how quickly the signal decays. As a rough estimate, if the signal has a half-life of around $N$ samples, then fitting $N$ samples should give a decent estimate and $5N$ should give quite a confident one.
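
For the log-linear fit, the trade-off between point count and accuracy can be written down explicitly. The following is a sketch under stated assumptions: equally spaced samples over a fixed time span `window`, an ordinary (unweighted) least-squares fit of $\log y$ against time, and independent noise with constant standard deviation `sigma` on $\log y$ (that is, purely multiplicative noise). The function names are hypothetical. For such a fit the standard error of the decay rate is $\sigma/\sqrt{S_{tt}}$ with $S_{tt}=\sum_i (t_i-\bar t)^2 = \text{window}^2\, n(n+1)/(12(n-1))$.

```python
import math


def slope_std_error(n, window, sigma):
    """Predicted std. error of the fitted decay rate for n equally spaced points.

    Assumes an unweighted log-linear fit and independent noise of standard
    deviation `sigma` on log(y), with the points spanning a time `window`.
    """
    if n < 2:
        raise ValueError("need at least two points")
    # Sum of squared deviations of n equally spaced times on [0, window].
    s_tt = window ** 2 * n * (n + 1) / (12.0 * (n - 1))
    return sigma / math.sqrt(s_tt)


def min_points(target_se, window, sigma, n_max=100_000):
    """Smallest n whose predicted error is <= target_se, or None if not reached."""
    for n in range(2, n_max + 1):
        if slope_std_error(n, window, sigma) <= target_se:
            return n
    return None


if __name__ == "__main__":
    # Assumed example values: window of 3 time units, 5% noise on log(y).
    for n in (10, 40, 160, 640):
        print(n, round(slope_std_error(n, window=3.0, sigma=0.05), 5))
    print("points needed for an error of 0.01:", min_points(0.01, 3.0, 0.05))
```

For large $n$ the error scales as $1/\sqrt{n}$, so quadrupling the number of points over the same window only halves the uncertainty; that is where the diminishing returns set in. If additive noise matters, the constant-`sigma` assumption fails and this prediction will be optimistic.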



One last note: if the additive noise is a real problem and you don't care about the magnitude $a$ of the signal (only the decay rate $b$), you can average your signal to get a cleaner readout. That is, average each sample with a few before it. This changes the magnitude $a$, but the result is still an exponential decay with the same decay rate $b$, and it can remove a lot of additive noise.
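
A short derivation shows why the averaging leaves the rate untouched. It is a sketch that assumes equally spaced samples, $x_i = x_0 + i\Delta$, and a plain moving average over the last $k$ samples (both assumptions for illustration):

```latex
\bar y_i = \frac{1}{k}\sum_{j=0}^{k-1} e^{a - b x_{i-j}}
         = e^{a - b x_i}\,\frac{1}{k}\sum_{j=0}^{k-1} e^{b j \Delta}
         = e^{a' - b x_i},
\qquad a' = a + \log\!\left(\frac{1}{k}\sum_{j=0}^{k-1} e^{b j \Delta}\right).
```

Only the offset changes, from $a$ to $a'$. Note that neighbouring averaged samples share raw samples, so their noise is correlated and the error bars from a plain least-squares fit of the averaged data will look better than they really are.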







– Alex Meiburg, answered Jul 28 '17 at 19:19

    • Thanks @AlexMeiburg. This one actually gets pretty close. But, I think it emphasizes how the digital and analog world diverge. You are correct that if the half-life is N, then 5N should be sufficient; the problem we are facing is that the N parameter (the resolution) is flexible. This is a measurement and we can actually change N for a given decay constant since we are making measurements as a function of time and we simply have to change the resolution of the measurement. Unfortunately, this resolution approaches 0 but we reach a computational limit.
      – cirrusio
      Aug 1 '17 at 15:40










    • Ah, gotcha. Well then, ask yourself about the noise level locally, and remember that taking N points will make noise drop by sqrt(N). For instance, if a typical measurement is ±10%, then you can think of taking 100 measurements in one half-life period as giving you a 1% accuracy reading of that. Subsequently, if you measured ~300 points over 300 half-lives, that seems like it should give you ~1% accuracy on the overall rate.
      – Alex Meiburg
      Aug 1 '17 at 18:34
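The sqrt(N) rule in the comment above can be checked with a short simulation. This is a sketch under assumed conditions: independent Gaussian noise with a 10% standard deviation around a true value of 1. The function name `spread_of_mean` and the trial count are hypothetical.

```python
import math
import random
import statistics


def spread_of_mean(n_points, sigma=0.10, trials=2000, seed=0):
    """Empirical standard deviation of the mean of n_points noisy readings."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        readings = [rng.gauss(1.0, sigma) for _ in range(n_points)]
        means.append(sum(readings) / n_points)
    return statistics.stdev(means)


# Compare the simulated spread with the sigma / sqrt(N) prediction.
for n in (1, 25, 100):
    print(n, spread_of_mean(n), 0.10 / math.sqrt(n))
```

The rule relies on the noise being independent from sample to sample; correlated noise averages down more slowly.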




























    up vote
    0
    down vote













    I'm not sure I understand what you're doing, but perhaps some elementary
    relationships will be useful. If none of the answers here are sufficient,
    then you might consider posting this question on our sister site.



    Let $X_1, X_2, \dots, X_n$ be a random sample from
    $\mathsf{Exp}(\mathrm{mean}=\mu) \equiv \mathsf{Exp}(\mathrm{rate}=1/\mu).$
    Then an unbiased estimator of $\mu$ is the sample mean $A = \bar X.$
    An empirical cumulative distribution function (ECDF) estimates the
    CDF of the population distribution.



    Because $A/\mu \sim \mathsf{Gamma}(\mathrm{shape} = n, \mathrm{rate} = n),$ we can find
    constants $L$ and $U$ that cut off 2.5% of the probability from the
    lower and upper tails, respectively, of this distribution so that
    $$P(L < A/\mu < U) = P(A/U < \mu < A/L) = .95$$
    and a 95% confidence interval for $\mu$ is $(A/U, A/L).$
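As a rough check of the interval construction above, the following sketch estimates by simulation how often the interval contains the true mean for $n = 10$. The constants `L` and `U` are assumptions: they are the 2.5% and 97.5% quantiles of the Gamma(shape = 10, rate = 10) distribution, taken as chi-square(20) quantiles divided by 20. The function names `mean_ci` and `coverage` are hypothetical.

```python
import random

# Assumed constants: quantiles of Gamma(shape=10, rate=10), obtained as
# chi-square(20) quantiles 9.5908 and 34.1696 divided by 20.
N, L, U = 10, 0.47954, 1.70848


def mean_ci(sample):
    """Return the sample mean A and the interval (A/U, A/L) for mu."""
    a = sum(sample) / len(sample)
    return a, (a / U, a / L)


def coverage(mu, trials, seed=0):
    """Fraction of simulated samples whose interval contains the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate, so the mean is mu
        sample = [rng.expovariate(1 / mu) for _ in range(N)]
        _, (lo, hi) = mean_ci(sample)
        hits += lo < mu < hi
    return hits / trials


print(coverage(mu=3.0, trials=20000))   # should come out near 0.95
```

For other sample sizes the two constants have to be recomputed from the corresponding Gamma(shape = n, rate = n) quantiles.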



    Below, I simulate a sample of size $n = 10$ from $\mathsf{Exp}(\mu = 3).$
    The figure shows the ECDF of the sample, the CDF for
    $\mathsf{Exp}(A)$ (dashed red curve) and the CDF for the population
    distribution $\mathsf{Exp}(3)$ (blue curve). Some simulations showed better fit
    and others showed worse fit. For the simulation shown, $A = 3.48$ and the CI for $\mu$
    is $(2.04, 7.25).$



    [Figure: empirical CDF of the $n = 10$ sample with the estimated (red) and exact (blue) CDFs]



    Here is elementary R code (in R, the second argument of rexp is the
    rate parameter):



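    # Simulate n exponential observations with mean mu (rexp takes the rate, 1/mu)
    n = 10; mu = 3; x = rexp(n, 1/mu); a = mean(x)
    plot(ecdf(x), ylab="CDF", main="Empirical CDF with Estimated (red) and Exact CDFs")
    curve(pexp(x, 1/a), lwd=3, lty="dashed", col="red", n=1001, add=TRUE)  # estimated CDF
    curve(pexp(x, 1/mu), col="blue", n=1001, add=TRUE)                     # exact CDF
    # Point estimate, then the 95% CI (a/U, a/L) from Gamma(shape=n, rate=n) quantiles
    a; a/qgamma(c(.975,.025), n, n)
    ## 3.479 # est of pop mean
    ## 2.036312 7.254886 # CI for pop mean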


    For a sample of $n = 100,$ one simulation gave $A = 3.31$ and CI $(2.74, 4.06).$
    [I had to do several simulations to get one that resulted in visible separation among
    the ECDF and the two CDFs.]



    [Figure: empirical CDF of the $n = 100$ sample with the estimated (red) and exact (blue) CDFs]



    I'm wondering whether such elementary methods of approximating the actual
    CDF by the ECDF or the estimated CDF would be of any use in your project.














        edited Jul 29 '17 at 5:03

























        answered Jul 29 '17 at 4:56









        BruceET





















            up vote
            -1
            down vote













            As stated in the original post, the function is a single exponential.



            Also, as stated in the original post, we are trying to DETERMINE the number of points needed. As a rough guideline, we sample 100s to 1000s of points for each curve currently. But maybe we only need 10 or 50 or 100.
















            answered Jul 28 '17 at 18:59









            TDG












            • This is not an answer. Are you the original poster? If so, the FAQ will show you how to merge the accounts, or you can edit the original post using the edit button below.
              – Ross Millikan
              Jul 28 '17 at 19:02










            • @RossMillikan - that was not the original poster but a colleague. This should have been a comment for clarification not an answer.
              – cirrusio
              Aug 1 '17 at 15:41






























             
