In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which provide the basis for models that are discussed in more detail in subsequent chapters.
Material removal in rotary ultrasonic machining (RUM) of hard, brittle materials is a highly complicated process involving the combined effects of numerous abrasive grains with randomly distributed sizes and penetration depths. These stochastic characteristics produce marked differences in the extrusion load between the material and each individual grain, and their aggregate effect strongly influences the cutting force of the diamond tool. However, few mechanistic cutting-force prediction models have incorporated the random distribution of the abrasive grains, which limits current methods for optimizing (reducing) the cutting force in the RUM process. Taking into account the machining kinematics of the abrasive grains and their distribution over the tool end face, the number of effective grains and their penetration depths were calculated using probability statistics. A novel theoretical cutting-force model was then established that incorporates the Gaussian distributions of grain size and penetration depth. Confirmatory experiments were performed to validate the proposed model, showing that the predicted results agreed well with the experimental measurements. Furthermore, the active abrasive grains were found to account for 2.972% of the total number on the tool end face at the specific processing parameters studied. Additionally, the model predicts that the cutting force decreases irregularly as grain size increases, which is attributed to the coupling between grain size and grain number.
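The probability-statistics step described above can be illustrated with a small Monte Carlo sketch. All quantities below (the Gaussian protrusion-height parameters, the tool-workpiece gap, and the grain count) are invented for illustration, not the paper's values; the sketch only shows how a Gaussian height distribution yields a small active-grain fraction and a distribution of penetration depths.

```python
import random
import statistics

# Monte Carlo sketch: grain protrusion heights on the tool end face
# are modeled as Gaussian, and only grains protruding beyond the
# tool-workpiece gap actually cut.  All parameters are illustrative.
random.seed(0)
mu, sigma = 50e-6, 5e-6       # mean / std of protrusion height (m)
gap = 60e-6                   # assumed tool-workpiece separation (m)
n_grains = 100_000            # grains on the end face

heights = [random.gauss(mu, sigma) for _ in range(n_grains)]
active = [h - gap for h in heights if h > gap]   # penetration depths

frac = len(active) / n_grains
print(f"active fraction: {frac:.3%}")
print(f"mean penetration depth: {statistics.mean(active) * 1e6:.2f} um")
```

With the gap set two standard deviations above the mean height, only the upper Gaussian tail (a few percent of grains) is active, which is the qualitative effect the abstract describes.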
Compared with optical images, Synthetic Aperture Radar (SAR) images have many defects, such as low resolution, strong noise interference, and randomly distributed targets, which increase the false-alarm rate of traditional detection methods. To improve detection accuracy on SAR images, a novel detection method is proposed based on regional probability statistics and saliency analysis. A saliency analysis model based on dense and sparse reconstruction (DSR) is rebuilt to locate the target precisely. First, the regional probability of the SAR image is estimated to extract the background region. The extracted background sub-blocks are then clustered and used to replace the corresponding background template set of the DSR model. Finally, the rebuilt DSR model is used to extract the target, greatly improving the detection accuracy of the proposed method. Compared with the constant false alarm rate (CFAR)-based detection method, the proposed method achieves higher detection accuracy and preserves the edges of the SAR image.
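The background-template construction can be sketched in miniature: sub-blocks are summarized by a scalar statistic and clustered into a small template set. The synthetic clutter populations and the 1-D k-means below are illustrative assumptions, not the paper's implementation.

```python
import random

# Background sub-blocks are summarized by mean intensity and
# clustered (1-D k-means) into a small template set for the DSR
# model.  The two synthetic "clutter" populations are invented.
random.seed(1)
blocks = [random.gauss(20, 3) for _ in range(200)] + \
         [random.gauss(80, 5) for _ in range(200)]

def kmeans_1d(xs, k=2, iters=50):
    """Plain 1-D k-means: assign to nearest center, recompute means."""
    centers = random.sample(xs, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            i = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[i].append(x)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return sorted(centers)

templates = kmeans_1d(blocks)
print(templates)   # one cluster center per clutter type
```

Replacing the DSR model's generic background templates with clutter-specific cluster centers like these is what lets reconstruction error highlight the target rather than background variation.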
A recent report demonstrated that 8-month-olds can segment a continuous stream of speech syllables, containing no acoustic or prosodic cues to word boundaries, into wordlike units after only 2 min of listening experience (Saffran, Aslin, & Newport, 1996). Thus, a powerful learning mechanism capable of extracting statistical information from fluent speech is available early in development. The present study extends these results by documenting the particular type of statistical computation, transitional (conditional) probability, used by infants to solve this word-segmentation task. An artificial language corpus, consisting of a continuous stream of trisyllabic nonsense words, was presented to 8-month-olds for 3 min. A postfamiliarization test compared the infants' responses to words versus part-words (trisyllabic sequences spanning word boundaries). The corpus was constructed so that test words and part-words were matched in frequency, but differed in their transitional probabilities. Infants showed reliable discrimination of words from part-words, thereby demonstrating rapid segmentation of continuous speech into words on the basis of transitional probabilities of syllable pairs.
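The transitional-probability computation itself is simple to sketch: TP(x → y) = count(xy) / count(x) over the syllable stream. The three nonsense words below are invented stand-ins for the artificial-language corpus.

```python
import random
from collections import Counter

random.seed(0)
# Invented trisyllabic nonsense words; a continuous stream is built
# by concatenating them in random order with no pauses, so the only
# word-boundary cue is the syllable statistics.
words = ["tupiro", "golabu", "bidaku"]
stream = [random.choice(words) for _ in range(300)]
syllables = [w[i:i + 2] for w in stream for i in range(0, 6, 2)]

pairs = Counter(zip(syllables, syllables[1:]))
singles = Counter(syllables[:-1])

def tp(x, y):
    """Transitional probability TP(x -> y) = count(xy) / count(x)."""
    return pairs[(x, y)] / singles[x]

print(tp("tu", "pi"))   # within-word pair: TP = 1.0
print(tp("ro", "go"))   # pair spanning a word boundary: roughly 1/3
```

Word-internal pairs always co-occur (TP = 1.0), while boundary pairs depend on which word happens to follow, which is exactly the contrast the infants exploited even though words and part-words were frequency-matched.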
The parameter identification of channel codes plays a significant role in adaptive modulation and coding (AMC) as well as in non-cooperative communications. In this paper, an algorithm based on probability statistics and the Galois field Fourier transform (PS-GFFT) is proposed to identify the parameters of Reed-Solomon (RS) codes. A threshold obtained from probability statistics is used to skip wrong parameters within a candidate set, while the GFFT is applied to reduce the error identification probability. Meanwhile, an upper bound on the correct recognition rate of RS codes is derived and proved, which quantifies the influence of the received codeword length, the bit error rate of the codewords, and the number of bits per symbol on the accuracy of parameter estimation. To the best of our knowledge, this upper bound, which is of great significance in evaluating the performance of recognition algorithms, is provided here for the first time. Extensive simulation results show that the proposed algorithm has better recognition performance than existing RS code recognition algorithms. Specifically, the correct recognition probability for RS codes of length no more than 255 can exceed 90% when the bit error rate of the codewords is below 3 × 10^-3, whereas the conventional algorithms achieve at best a correct recognition probability of 10%. Furthermore, the correct recognition rate of the proposed algorithm is observed to be close to the derived upper bound, especially for long code lengths, which further verifies its superiority.
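The probability-statistics threshold in such schemes is typically a tail bound: under a wrong candidate parameter, a per-codeword check succeeds only at chance, so candidates whose success count stays below the bound can be skipped. The sketch below illustrates that generic idea only; the chance probability p0, the 4-sigma confidence level, and the toy counts are assumptions, not the paper's derivation.

```python
import math

# Generic skip-threshold sketch.  Under a WRONG candidate parameter
# a per-codeword check succeeds with some chance probability p0;
# under the RIGHT one it succeeds far more often.  A normal
# approximation to the binomial chance count gives a threshold.
# p0 and the 4-sigma level are illustrative assumptions.
def skip_threshold(n_codewords, p0=0.5, n_sigma=4.0):
    mean = n_codewords * p0
    std = math.sqrt(n_codewords * p0 * (1 - p0))
    return mean + n_sigma * std

n = 1000
thr = skip_threshold(n)
print(thr)   # chance level ~500; threshold a few sigma above it

# Keep a candidate only if its observed success count clears it
# (the counts below are invented for illustration):
observed = {"wrong_candidate": 512, "right_candidate": 930}
kept = [c for c, k in observed.items() if k > thr]
print(kept)
```

The point of such a threshold is purely computational: wrong candidates are rejected cheaply, and only survivors are passed to the more expensive GFFT stage.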
We discuss some recent results related to the deduction of a suitable probabilistic model for the description of the statistical features of a given deterministic dynamics. More precisely, we motivate and investigate the computability of invariant measures and some related concepts. We also present some experiments investigating the limits of naive simulations in dynamics.
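A classic illustration of the limits of naive simulation: the doubling map T(x) = 2x mod 1 preserves Lebesgue measure, so a typical orbit should fill [0, 1) uniformly, yet in binary floating point each iteration discards one mantissa bit, and every double-precision orbit collapses to the fixed point 0. A minimal sketch:

```python
# Doubling map T(x) = 2x mod 1 iterated in double precision.
# Mathematically the orbit of a typical point equidistributes with
# respect to Lebesgue measure; numerically, each doubling shifts the
# mantissa left one bit, so after ~53 steps the orbit is exactly 0.
x = 0.1234567
orbit = []
for _ in range(100):
    x = (2 * x) % 1.0
    orbit.append(x)

print(orbit[:3])    # looks chaotic at first
print(orbit[-1])    # but the tail is exactly 0.0
```

This is precisely the kind of discrepancy between a computable invariant measure (here, Lebesgue) and a naive finite-precision simulation that the abstract refers to.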
In view of the lack of intelligent guidance in online English composition teaching, this paper proposes an intelligent support system for English writing based on the B/S (browser/server) model. Building on a corpus of vocabulary, grammar rules, and other language resources, the system applies natural language processing that combines rule matching with probability statistics to evaluate compositions and optimize marking efficiency. Empirical results show that the system can effectively adjust the teaching direction according to the results of intelligent quantitative analysis.
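The combination of rule matching and probability statistics can be sketched in miniature: regex rules flag surface errors while a smoothed bigram model scores fluency. The rules and the tiny corpus below are invented for illustration and are not the system's actual resources.

```python
import re
from collections import Counter

# Invented rule set: each rule is (pattern, feedback message).
rules = [
    (re.compile(r"\ba\s+[aeiou]", re.I), "use 'an' before a vowel sound"),
    (re.compile(r"\s{2,}"), "remove doubled spaces"),
]

# Tiny invented corpus for the probability-statistics side.
corpus = "the students write essays the teachers read essays".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def bigram_prob(w1, w2):
    """P(w2 | w1) with add-one smoothing over the corpus vocabulary."""
    v = len(set(corpus))
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + v)

def check(sentence):
    """Return (rule-based issues, bigram fluency score)."""
    issues = [msg for pat, msg in rules if pat.search(sentence)]
    words = sentence.lower().split()
    score = 1.0
    for w1, w2 in zip(words, words[1:]):
        score *= bigram_prob(w1, w2)
    return issues, score

issues, score = check("the students write a essays")
print(issues)   # the article rule fires on "a essays"
print(score)    # low probability flags awkward phrasing
```

In a real system the rules would encode grammar-book patterns and the statistics would come from a large learner or reference corpus, but the division of labor is the same: rules give targeted feedback, probabilities rank overall fluency.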