We study a model of a controlled queueing network, which operates and makes control decisions in discrete time. An underlying random network mode determines the set of available controls in each time slot. Each control decision “produces” a certain vector of “commodities”; it also has an associated “traditional” queueing control effect, i.e., it determines traffic (customer) arrival rates, service rates at the nodes, and random routing of processed customers among the nodes. The problem is to find a dynamic control strategy which maximizes a concave utility function H(X), where X is the average value of the commodity vector, subject to the constraint that network queues remain stable. We introduce a dynamic control algorithm, which we call the Greedy Primal-Dual (GPD) algorithm, and prove its asymptotic optimality. We show that our network model and GPD algorithm accommodate a wide range of applications. As one example, we consider the problem of congestion control of networks where both traffic sources and network processing nodes may be randomly time-varying and interdependent. We also discuss a variety of resource allocation problems in wireless networks, which in particular involve average power consumption constraints and/or optimization, as well as traffic rate constraints.

The Foster–Lyapunov theorem and its variants serve as the primary tools for studying the stability of queueing systems. In addition, it is well known that setting the drift of the Lyapunov function equal to zero in steady state provides bounds on the expected queue lengths. However, such bounds are often very loose due to the fact that they fail to capture resource pooling effects. The main contribution of this paper is to show that the approach of “setting the drift of a Lyapunov function equal to zero” can be used to obtain bounds on the steady-state queue lengths which are tight in the heavy-traffic limit. The key is to establish an appropriate notion of state-space collapse in terms of steady-state moments of weighted queue length differences and use this state-space collapse result when setting the Lyapunov drift equal to zero. As an application of the methodology, we prove the steady-state equivalent of the heavy-traffic optimality result of Stolyar for wireless networks operating under the MaxWeight scheduling policy.
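
As a rough illustration of the MaxWeight scheduling policy mentioned above, the following minimal sketch picks, in each time slot, the feasible schedule with the largest queue-weighted service. The function name `max_weight`, the representation of schedules as tuples of per-queue service rates, and the tie-breaking rule (first maximizer wins) are assumptions made for this example and are not taken from the paper.

```python
def max_weight(queue_lengths, schedules):
    """Return the schedule maximizing sum_i q_i * s_i.

    queue_lengths: sequence of current queue lengths q_i.
    schedules: iterable of feasible schedules, each a sequence of
               per-queue service rates s_i (hypothetical representation).
    Ties are broken in favor of the first maximizer encountered.
    """
    def weight(schedule):
        # Queue-length-weighted service offered by this schedule.
        return sum(q * s for q, s in zip(queue_lengths, schedule))

    return max(schedules, key=weight)


# Two queues that cannot be served simultaneously.
feasible = [(1, 0), (0, 1)]
chosen = max_weight([5, 1], feasible)  # serves the longer first queue
```

The policy needs only the current queue lengths, not the arrival rates, which is why it is a natural object for drift-based analysis.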

The model is motivated by the problem of load distribution in large-scale cloud-based data processing systems. We consider a heterogeneous service system, consisting of multiple large server pools. The pools are different in that their servers may have different processing speeds and/or different buffer sizes (which may be finite or infinite). We study an asymptotic regime in which the customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to some scaling parameter n. Arriving customers are assigned to the servers by a “router,” according to a pull-based algorithm, called PULL. Under the algorithm, each server sends a “pull-message” to the router when it becomes idle; the router assigns an arriving customer to a server according to a randomly chosen available pull-message, if there are any, or to a random server otherwise. Assuming subcritical system load, we prove asymptotic optimality of PULL. Namely, as the system scale $n \rightarrow \infty$, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. We also describe some generalizations of the model and PULL algorithm, for which the asymptotic optimality still holds.
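
The routing rule described above can be sketched as a small router object. This is only a toy illustration under stated assumptions: the class name `PullRouter`, its method names, and the convention that every server starts idle (so the router initially holds one pull-message per server) are hypothetical and not taken from the paper; service times, buffers, and server pools are not modeled.

```python
import random


class PullRouter:
    """Toy router for a pull-based assignment rule (illustrative only)."""

    def __init__(self, num_servers, rng=None):
        self.num_servers = num_servers
        self.rng = rng if rng is not None else random.Random()
        # Assumption: all servers start idle, so each has sent a pull-message.
        self.pull_messages = list(range(num_servers))

    def server_became_idle(self, server):
        """Record the pull-message a server sends when it becomes idle."""
        self.pull_messages.append(server)

    def assign(self):
        """Return the index of the server that receives an arriving customer."""
        if self.pull_messages:
            # Pick an available pull-message uniformly at random and consume it.
            i = self.rng.randrange(len(self.pull_messages))
            last = len(self.pull_messages) - 1
            self.pull_messages[i], self.pull_messages[last] = (
                self.pull_messages[last],
                self.pull_messages[i],
            )
            return self.pull_messages.pop()
        # No pull-messages available: fall back to a uniformly random server.
        return self.rng.randrange(self.num_servers)
```

With three servers, the first three arrivals each go to a distinct idle server; a fourth arrival before any server becomes idle is sent to a uniformly random server.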

We consider optimal pricing for a two-station tandem queueing system with finite buffers, communication blocking, and price-sensitive customers whose arrivals form a homogeneous Poisson process. The service provider quotes prices to incoming customers using either a static or dynamic pricing scheme. There may also be a holding cost for each customer in the system. The objective is to maximize either the discounted profit over an infinite planning horizon or the long-run average profit of the provider. We show that there exists an optimal dynamic policy that exhibits a monotone structure, in which the quoted price is non-decreasing in the queue length at either station and is non-increasing if a customer moves from station 1 to 2, for both the discounted and long-run average problems under certain conditions on the holding costs. We then focus on the long-run average problem and show that the optimal static policy performs as well as the optimal dynamic policy when the buffer size at station 1 becomes large, there are no holding costs, and the arrival rate is either small or large. We learn from numerical results that for systems with small arrival rates and no holding cost, the optimal static policy produces a gain quite close to the optimal gain even when the buffer at station 1 is small. On the other hand, for systems with arrival rates that are not small, there are cases where the optimal dynamic policy performs much better than the optimal static policy.

For a class of discrete-time queueing systems, we present a new exact method of computing both the expectation and the distribution of the queue length. This class of systems includes the bulk-service queue and the fixed-cycle traffic-light (FCTL) queue, which is a basic model in traffic-control research and can be seen as a non-exhaustive time-limited polling system. Our method avoids finding the roots of the characteristic equation, which enhances both the reliability and the speed of the computations compared to the classical root-finding approach. We represent the queue-length expectation in an exact closed-form expression using a contour integral. We also introduce several realistic modifications of the FCTL model. For the FCTL model for a turning flow, we prove a decomposition result. This allows us to derive a bound on the difference between the bulk-service and FCTL expected queue lengths, which turns out to be small in most of the realistic cases.

We consider a network of infinite-server queues where the input process is a Cox process of the following form: The arrival rate is a vector-valued linear transform of a multivariate generalized (i.e., being driven by a subordinator rather than a compound Poisson process) shot-noise process. We first derive some distributional properties of the multivariate generalized shot-noise process. Then these are exploited to obtain the joint transform of the numbers of customers, at various time epochs, in a single infinite-server queue fed by the above-mentioned Cox process. We also obtain transforms pertaining to the joint stationary arrival rate and queue length processes (thus facilitating the analysis of the corresponding departure process), as well as their means and covariance structure. Finally, we extend to the setting of a network of infinite-server queues.

In this paper, a stochastic model of a call center with a two-level architecture is analyzed. A first-level pool of operators answers calls, identifies, and handles non-urgent calls. A call classified as urgent has to be transferred to specialized operators at the second level. When the operators of the second level are all busy, the first-level operator handling the urgent call is blocked until an operator at the second level is available. Under a scaling assumption, the evolution of the number of urgent calls blocked at level 1 is investigated. It is shown that if the ratio of the number of operators at level 2 to that at level 1 is greater than some threshold, then, essentially, the system operates without congestion: with probability close to 1, no urgent call is blocked after some finite time. Otherwise, we prove that a positive fraction of the operators of the first level is blocked due to the congestion of the second level. Stochastic calculus with Poisson processes, coupling arguments, and formulations in terms of Skorokhod problems are the main mathematical tools used to establish these convergence results.

Stochastic fluid models have been widely used to model the level of a resource that changes over time, where the rate of variation depends on the state of some continuous-time Markov process. Latouche and Taylor (Queueing Syst 63:109-129, 2009) introduced an approach, using matrix analytic methods and the reduced load approximation for loss networks, to analyse networks of fluid models all driven by the same modulating process where the buffers are infinite. We extend the method to networks involving finite buffer models and illustrate the approach by deriving performance measures for a simple network as characteristics such as buffer size are varied. Our results provide insight into the situations where the infinite buffer model is a reasonable approximation to the finite buffer model.

This paper proposes a new algorithm for computing the stationary distribution vector in continuous-time upper block-Hessenberg Markov chains. To this end, we consider the last-block-column-linearly-augmented (LBCL-augmented) truncation of the (infinitesimal) generator of the upper block-Hessenberg Markov chain. The LBCL-augmented truncation is a linearly augmented truncation such that the augmentation distribution has its probability mass only on the last block column. We first derive an upper bound for the total variation distance between the respective stationary distribution vectors of the original generator and its LBCL-augmented truncation. Based on the upper bound, we then establish a series of linear fractional programming (LFP) problems to obtain augmentation distribution vectors such that the bound converges to zero. Using the optimal solutions of the LFP problems, we construct a matrix-infinite-product (MIP) form of the original (i.e., not approximate) stationary distribution vector and develop a sequential update algorithm for computing the MIP form. Finally, we demonstrate the applicability of our algorithm to BMAP/M/ queues and M/M/s retrial queues.

A univariate Hawkes process is a simple point process that is self-exciting and has a clustering effect. The intensity of this point process is given by the sum of a baseline intensity and another term that depends on the entire past history of the point process. Hawkes processes have wide applications in finance, neuroscience, social networks, criminology, seismology, and many other fields. In this paper, we prove a functional central limit theorem for stationary Hawkes processes in the asymptotic regime where the baseline intensity is large. The limit is a non-Markovian Gaussian process with dependent increments. We use the resulting approximation to study an infinite-server queue with high-volume Hawkes traffic. We show that the queue length process can be approximated by a Gaussian process, for which we compute explicitly the covariance function and the steady-state distribution. We also extend our results to multivariate stationary Hawkes processes and establish limit theorems for infinite-server queues with multivariate Hawkes traffic.
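
To make the structure of the intensity concrete, here is a minimal sketch of a Hawkes intensity as a baseline plus a sum over past events. The exponential kernel `alpha * exp(-beta * (t - s))` and the names `hawkes_intensity`, `baseline`, `alpha`, and `beta` are assumptions chosen for illustration; the abstract does not specify the kernel.

```python
import math


def hawkes_intensity(t, event_times, baseline, alpha, beta):
    """Intensity at time t: baseline plus the excitation from past events.

    Assumes an exponential kernel h(u) = alpha * exp(-beta * u), which is
    one common choice and not necessarily the one used in the paper.
    Only events strictly before t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return baseline + excitation


# With no past events the intensity equals the baseline.
no_history = hawkes_intensity(1.0, [], baseline=2.0, alpha=0.5, beta=1.0)
```

Each past event raises the intensity, and the effect decays over time, which is the source of the self-exciting, clustering behavior described above.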

We establish a central-limit-theorem (CLT) version of the periodic Little’s law (PLL) in discrete time, which complements the sample-path and stationary versions of the PLL we recently established, motivated by data analysis of a hospital emergency department. Our new CLT version of the PLL extends previous CLT versions of LL. As with the LL, the CLT version of the PLL is useful for statistical applications.

In this paper, we provide convergence analysis for a class of Brownian queues in tandem by establishing an exponential drift condition. A consequence is uniform exponential ergodicity for these multidimensional diffusions, including the O'Connell-Yor process. A list of open problems is also presented.

We consider strategic arrivals to a FCFS service system that starts service at a fixed time and has to serve a fixed number of customers, for example, an airplane boarding system. Arriving early induces a higher waiting cost (waiting before service begins) while arriving late induces a cost because earlier arrivals take the better seats. We first consider arrivals of heterogeneous customers that choose arrival times to minimize the weighted sum of waiting cost and cost due to expected number of predecessors. We characterize the unique Nash equilibrium for this system. Next, we consider a system offering L levels of priority service with a FCFS queue for each priority level. Higher priorities are charged higher admission prices. Customers make two choices: time of arrival and priority of service. We show that the Nash equilibrium corresponds to the customer types being divided into L intervals and customers belonging to each interval choosing the same priority level. We further analyze the net revenue to the server and consider revenue-maximizing strategies: number of priority levels and pricing. Numerical results show that with only a small number of queues (two or three) the server can obtain nearly the maximum revenue.