## ABSTRACT

The origin of the allometric relationship between standard metabolic rate (MR) and body mass (*M*), often described as MR=*aM ^{b}*, remains puzzling, and interpretation of the mass-scaling exponent,

*b*may depend on the methodological approach, shapes of residuals, coefficient of determination (

*r*

^{2}) and sample size. We investigated the mass scaling of MRs within and between species of Carabidae beetles. We used ordinary least squares (OLS) regression, phylogenetically generalized least squares (PGLS) regression and standardized major axis (SMA) regression to explore the effects of different model-fitting methods and data clustering caused by phylogenetic clades (grade shift) and gas exchange patterns (discontinuous, cyclic and continuous). At the interspecific level, the relationship between MR and

*M*was either negatively allometric (

*b*<1) or isometric (

*b*=1), depending on the fitting method. At the intraspecific level, the relationship either did not exist or was isometric or positively allometric (

*b*>1), and the fit was significantly improved after the analyzed dataset was split according to gas exchange patterns. The studied species originated from two distinct phylogenetic clades that had different intercepts but a common scaling exponent (OLS, 0.61) that was much shallower than the scaling exponent for the combined dataset for all species (OLS, 0.71). The best scaling exponent estimates were obtained by applying OLS while accounting for grade shifts or by applying PGLS. Overall, we show that allometry of MR in insects can depend heavily on the model fitting method, the structure of phylogenetic non-independence and ecological factors that elicit different modes of gas exchange.

## INTRODUCTION

Metabolism enables organisms to transform matter from the environment into body structures and functions. Lavoisier's respiratory experiments conducted on humans and animals in the 18th century accelerated physiology into a new era (for a review, see Karamanou and Androutsos, 2013), and since that time, scientists have been interested in estimating metabolic rates (MRs) by measuring the rates of oxygen consumption or carbon dioxide release. More recently, it has become clear that a mutual link between the MRs of organisms and ecological processes may affect numerous physiological and ecological patterns, e.g. population densities and life-history evolution (Killen et al., 2010; Kozłowski et al., 2004; Rasmussen and Izard, 1988; White and Seymour, 2004). Metabolic rates depend on different organismal properties, but they especially reflect body mass (*M*) differences among organisms (Kleiber, 1961; Schmidt-Nielsen, 1984), and this relationship is often approximated with a simple power function, MR=*aM ^{b}*, with a coefficient

*a*and a mass-scaling exponent

*b*(Sibly et al., 2012).

The history of research on the origin of MR mass scaling is tortuous; specifically, the pioneering studies, as well as more recent studies at the end of the 20th century, claimed that the mass-scaling exponent *b* has a universal value in all organisms, which equals either 0.67 (surface law; Sarrus and Rameaux, 1839) or 0.75 (Kleiber’s law; Kleiber, 1932; West et al., 1997, 1999). However, other values of *b* and a high variance in *b* were reported at the same time, challenging the idea of universal mass-scaling laws for MRs (e.g. Bokma, 2004; da Silva et al., 2006; Downs et al., 2008; Glazier, 2005; Heusner, 1982; Hulbert, 2014; Kozlowski and Konarzewski, 2004; Snelling et al., 2011; White et al., 2009).

The ongoing debate about scaling exponents increasingly addresses methodological factors that can lead to different results and conclusions from the same datasets. It is unarguable that if the *b* value is biologically meaningful, then applying an appropriate methodological approach to its estimation is of great importance. The choice of a statistical method to estimate the value of an exponent is one of the most obvious decisions (Hui et al., 2010; White, 2011). In a classical approach, a regression line is fitted to the data by the ordinary least squares (OLS) technique (Schmidt-Nielsen, 1984), for which the following assumptions should be met: (1) statistical independence of data points, (2) mass measured without error and (3) mass independent of the variable of interest (here, MR). However, these assumptions are often violated, especially in interspecific analyses. For example, shared phylogenetic ancestry violates assumption 1, and an intraspecific variation in *M* violates assumption 2 if a single average value is calculated from a sample representing an entire taxon. An alternative method, called standardized major axis (SMA, which is also known as reduced major axis; Warton et al., 2006) regression, does not require assumption 2 because it simultaneously minimizes the vertical and horizontal distances of data points from a fitted line, as opposed to only the vertical distances as in the OLS method. However, SMA regression assumes that the error variances are proportional to the total variance, and this assumption can often be violated (Kilmer and Rodriguez, 2017; McArdle, 2003). Estimations of a scaling exponent with these two models depend on *r*^{2}; when *r*^{2} is small, the discrepancy between these two models is large. In the case of a low *r*^{2} and either a high measurement error in the *M* or, in the case of interspecific scaling, high within-species variance in *M*, the *b* value is likely to be underestimated by the OLS model (Hui et al., 2010). Roughly, if the error variance in the *M* is less than one-third that of the MR, then the *b* value in the SMA model is likely to be an overestimate, so OLS would be recommended (McArdle, 1988). However, Kilmer and Rodriguez (2017) stress that this is only a rule of thumb and not a mathematically established procedure. Information regarding the measurement errors for *M* and MR should be considered when deciding which regression model, OLS or SMA, to use (White, 2011), but this practice is not routinely applied in mass-scaling studies. With developments in statistical software, the use of SMA has become more common in recent years (Niklas, 2004; White and Seymour, 2005a), as have phylogenetically informed methods (Chown et al., 2007; Ehnes et al., 2011; White et al., 2009) that help meet assumption 1. Individual taxa may not be statistically independent because of their phylogenetic relationships within a phylogenetic tree (Symonds and Elgar, 2002), and ignoring this non-independence may result in biased estimations of the exponent *b*. For example, if two clades have exactly the same exponent but differ in their average *M* and mass-independent MRs, fitting a single regression to all pooled data would generate a *b* value that differs from the actual scaling exponent within each clade (Martin et al., 2005). To add further complexity, specific taxa may be characterized by different biological characteristics that evolved to cope with environmental pressures and that ultimately affect MRs (McNab, 2008, 2009). If environmental pressures change with *M*, then one might expect a systematic change in biological characteristics with body mass, which can further translate into a given value of a scaling exponent (McNab, 1988). For example, during a resting and postabsorptive state, some insects employ the discontinuous (DGE), cyclic or continuous gas exchange modes (Marais et al., 2005). In some species of mite-free beetles, ants and grasshoppers, DGE is associated with lower MRs (Gibbs and Johnson, 2004; Gudowska et al., 2016; Huang et al., 2015), whereas in cockroaches and caterpillars, no significant differences in the rate of CO_{2} release (*V̇*_{CO2}) have been reported (Groenewald et al., 2013; Williams et al., 2010). Interestingly, engagement in a given gas exchange pattern can depend on biotic and abiotic environmental factors (reviewed in Chown et al., 2006; Contreras et al., 2014; White et al., 2007a). For example, DGE has been hypothesized to be a response to habitat aridity (Buck, 1953; Huang et al., 2015; Lighton, 1996; Sláma et al., 2007; Terblanche et al., 2008a), a hypoxic and/or hypercapnic environment (Lighton, 1998; Lighton and Berrigan, 1995) and parasite infestation (Gudowska et al., 2016; Miller, 1974).

In this study, we examined the relationship between *M* and standard MRs in 20 species of carabid beetles to address the following questions. (1) Do mass-scaling exponents on the interspecific and intraspecific levels agree with the theoretical values of 0.67 or 0.75, or 0.82, which is a non-phylogenetically corrected interspecific scaling exponent for insects (Chown et al., 2007)? (2) Does the topology of a phylogenetic tree affect the scaling exponent estimates, or can all species be treated as one uniform group (Carabidae)? (3) Does accounting for different gas exchange patterns affect the estimation of the scaling exponent? (4) Is the estimation of an exponent value affected by the statistical method?

## MATERIALS AND METHODS

### Animals

Carabid beetles (*Cychrus attenuatus*, *Carabus cancellatus*, *Ca. linnei*, *Ca. convexus*, *Ca. hortensis*, *Ca. glabratus*, *Ca. problematicus*, *Ca. arvensis*, *Ca. granulatus*, *Ca. nemoralis*, *Ca. auronitens*, *Ca. violaceus*, *Ca. coriaceus*, *Leistus rufomarginatus*, *Pterostichus burmeisteri*, *P. oblongopunctatus*, *Molops piceus*, *Abax schueppeli*, *A. ovalis* and *A. parallelepipedus*) were collected in autumn 2012 and spring 2013 in southern Poland (49°48′19.9″N, 19°56′01.8″E; 49°43′50.1″N, 19°52′46.0″E; permit number DOP-oz.6401.01.12.2013 JRO) using Barber pitfall traps. Specimens were housed in a climate chamber at 20°C and 70% relative humidity under a 12 h:12 h light:dark regime for 1 to 4 weeks before the MR measurements were taken. The beetles were fed mealworm (*Tenebrio molitor*) larvae *ad libitum* and then fasted for at least 17 h prior to the MR measurements to avoid respiration during the absorptive state.

### Respirometry

All individuals were weighed to the nearest 0.01 mg immediately before and after MR measurement (XP26, Mettler-Toledo, Greifensee, Switzerland). To constrain the locomotor activity, the beetles were wrapped in synthetic gauze before being placed in individual 20-ml glass chambers. Respirometry measurements were performed in a climate chamber (Pol-Eko Aparatura, Wodzisław Śląski, Poland) at 20°C, and *V̇*_{CO2} was measured using an eight-chamber (seven with beetles, one baseline) multiplexing respirometry system (Sable Systems International, Las Vegas, NV, USA) programmed to cycle among each of the metabolic chambers every 11 h. The ambient airstream from outside the building was divided into eight streams flowing at a rate of 50 ml min^{−1}, which were controlled by a mass flow controller (Sidetrack Mass Flow Controller, Sierra Instruments, Monterey, CA USA), and maintained at 60% relative humidity (DG-4 Dewpoint generator, Sable Systems); the air leaving the experimental chamber was dried with magnesium perchlorate (Merck, Darmstadt, Germany). Every second, the active chamber's airflow was directed through the multiplexer to an infrared CO_{2} gas analyzer (Li-7000, Li-Cor, Lincoln, NE, USA) to record the *V̇*_{CO2}. Each individual (*n*=562) was measured twice (once during the day and once during the night); each time, the individual was measured continuously for 80 min with a 10-min baseline measurement between consecutive measurements of different individuals. The chambers of animals not being measured were flushed with air of similar relative humidity and at a similar rate of flow.

### Data extraction and analyses

Prior to the analyses, the data were corrected to standard temperature and pressure and converted to ml CO_{2} min^{−1}, baselined and drift-corrected with ExpeData software (version 1.7.15, Sable Systems). Of the day and night measurements of each individual beetle, only the one (day or night) with the lower MR was selected for further analysis. Three different gas exchange patterns – DGE, cyclic and continuous – were determined based on the *V̇*_{CO2}, as defined by Marais et al. (2005) and Chown (2011) with modifications by Gudowska et al. (2016, 2017). First, the gas exchange patterns were classified as continuous when half or more of the records were above the average value, and second, periodic patterns (DGE or cyclic) were recognized as regular peaks and troughs in gas exchange and when less than 30% of the data was above the average. The next step, in which the minimum *V̇*_{CO2} was calculated as a percentage of the mean *V̇*_{CO2}, allowed differentiation between the latter two patterns. During DGE, the gas exchange with the environment dropped to values close to zero owing to the spiracles being completely closed, but cuticular gas exchange may still have occurred (Groenewald et al., 2012; Terblanche et al., 2008a). The minimum *V̇*_{CO2} was either below 6% or above 11% of the mean *V̇*_{CO2}, and no intermediate values were found. Thus, we classified the trace *V̇*_{CO2} as representing DGE when the calculated value did not exceed 6% and as the cyclic pattern when the value was above 11%. For measurements in which individuals maintained the DGE or cyclic pattern, the mean *V̇*_{CO2} from three consecutive cycles (median value: one to nine cycles for DGE and one to eight cycles for cyclic) were calculated. In the case of the continuous respiration mode, the lowest mean *V̇*_{CO2} over a 30-min interval was determined.

All statistical analyses were performed in R (v.3.2.1, www.r-project.org). A phylogenetic tree for the studied species (Fig. 1) was reconstructed following Deuve et al. (2012), Maddison et al. (1999) and Raupach et al. (2010), and because the branch lengths of Carabidae beetles have been poorly studied, we assumed an ultrametric tree (Gudowska et al., 2016). The studied genera clustered into two distinct clades (hereafter, clades 1 and 2), and this clustering was further considered to study the effect of a grade shift. Clade 1 included *Cychrus* and *Carabus*, and clade 2 included *Leistus*, *Pterostichus*, *Molops* and *Abax*. Statistical analyses were performed at the intraspecific and interspecific levels with pooled data on males and females. The intraspecific analyses were conducted separately for each species using individual measures represented by data points; note that for the intraspecific analysis of MR versus *M*, we selected the six most abundant species (*n*>37 for each species). The interspecific analyses used species-specific mean values, and they were performed on either all species pooled together or considering the clustering (clade) information. Prior to the analyses, all data were log_{10} transformed to normalize the distributions and to linearize the allometric relationship between MR and *M* [logMR=log(*a*+*b*)log*M*], and the slope of this regression was used to estimate the exponent *b* and the regression intercept to estimate the coefficient *a*.

To test whether MR depended on *M*, we first performed a simple correlation analysis, and to examine the values of the exponent *b* and the coefficient *a*, we then fitted regression lines using the OLS, phylogenetically generalized least squares (PGLS) and SMA techniques. The PGLS was performed with the pgls function in the caper package (https://CRAN.R-project.org/package=caper), but it was only used for the interspecific level. After fitting the regressions, we compared the mass-scaling exponent *b* between the clades and tested whether the empirical *b* deviated from the theoretical values of 0.67 (surface law), 0.75 (Kleiber's law) or the empirical value of 0.82. If the clades proved to have similar exponents, we assumed a common exponent *b* and compared the regression intercepts (log *a*) between clades. The OLS and SMA regressions were fitted, and the empirical values of *b* were compared with the values of 0.67, 0.75 and 0.82 with the smatr package (Warton et al., 2012). Comparisons of regression slopes and intercepts between clades were performed with different methods for OLS and SMA. The OLS comparisons employed a general linear model (GLM), which considered log *M* as a covariate, clade as a fixed factor and the interaction between log *M* and clade. The SMA comparisons were directly performed with the sma function, with the Wald test used to compare regression intercepts and the likelihood ratio test to compare regression slopes.

To compare the mass scaling of MR between different gas exchange patterns (DGE, cyclic and continuous), we reprocessed our data to divide each species into three groups of individuals with each group representing one of the three gas exchange patterns. These data were analyzed differently depending on whether the analysis concerned the interspecific or intraspecific level. For the interspecific analysis, we calculated species mean MR and M values for each gas exchange group. To estimate the exponent *b* and coefficient *a*, we used the OLS and SMA methods of the sma function and the PGLS method. To formally test for differences in regression parameters between gas exchange patterns, a general linear mixed model (GLMM) was run in the lme4 (Bates et al., 2015) and lmerTest (https://CRAN.R-project.org/package=lmerTest) packages. The GLMM considered log *M* as a covariate, gas exchange pattern as a fixed factor and species as a random effect component. Regression slopes were compared between the three gas exchange patterns by considering the log *M*×gas exchange pattern interaction. In the intraspecific analysis, we used data on individual beetles, but this analysis was narrowed to two species consisting of individuals with at least two gas exchange patterns and at least 30 individuals per pattern. Consistent with the methods used for the interspecific comparison of the mass scaling of MR between clades, we used the sma function to compare OLS and SMA regressions between different gas exchange patterns; note that these comparisons were performed with separate models for each species. Comparisons of regression slopes and intercepts between gas exchange patters followed interspecific methods used to compare clades.

We used GLM to analyze the interspecific relationships between *M* and the frequency of DGE. A species-specific frequency of beetles with DGE was our dependent variable (arcsine transformed prior to the analysis); log M was a covariate; and a clade was a fixed factor.

## RESULTS

### Interspecific level

According to our interspecific exponent *b* estimates obtained with OLS, SMA and PGLS for all 20 pooled species, MR scaled with *M* with a negative allometry (*b*<1), and the estimates of *b* did not differ significantly from 0.67 and 0.75, but were lower than 0.82 for OLS (Table 1). When we considered the clustering of species in two clades (Fig. 1), we did not find significant differences in the value of *b* between the clades (see exponents in Table 1; for OLS: GLM, *F*_{1,1}=1.99, *P*=0.18; for SMA: likelihood ratio test, *P*=0.08). Assuming a common slope for both clades, we found significant differences between clades in the regression intercepts (for OLS: GLM, *F*_{1,1}=9.64, *P*=0.007; for SMA: Wald test, *P*=0.001). Focusing on OLS, the relationships between MR and *M* in each clade were described by a different regression equation (clade 1: logMR=0.61log*M*–4.41; clade 2: logMR=0.61log*M*–4.56); in other words, for a given *M*, species in clade 1 had higher MRs than species in clade 2 (Fig. 2). We found an indication of a grade shift with larger species having higher MR for their mass, because the estimated common OLS exponent for both clades was equal to 0.61, which was lower than the scaling exponent 0.71 based on all species combined. Note that the PGLS gave the same exponent estimates for each clade as OLS but a much lower exponent estimate when all species were considered together. This means that the division of species into clades completely overcame the effect of phylogeny on the relationship between MR and *M*.

The comparison of the relationships between logMR and log*M* between gas exchange patterns (Table 2, Fig. 3) showed that beetles with the DGE (*n*=17), cyclic (*n*=18) or continuous (*n*=19) gas exchange pattern had a similar mass-scaling slope (GLMM, *P*=0.63), but they differed in the intercept (GLMM, *P*<0.001). The DGE beetles had a lower regression intercept (log *a*) than did the beetles with either the cyclic or continuous gas exchange pattern, whose intercepts did not differ from each other. In other words, at a given *M*, beetles engaged in DGE had lower MRs than the remaining beetles. The GLM results showed that the frequency of DGE in a species increased with *M* (*r*=0.67; *F*_{1,17}=7.89, *P*=0.01), but the frequency of DGE was similar in both clades (*F*_{1,17}=0.12, *P*=0.73).

### Intraspecific level

Within species, significant correlations between MR and *M* were found in five of the six analyzed species: *Abax ovalis* (*n*=64), *Carabus linnei* (*n*=40), *C. nemoralis* (*n*=79), *C. violaceus* (*n*=38) and *Pterostichus burmeisteri* (*n*=57). However, the *r*^{2} was relatively low in all species (0.08 to 0.44), generally indicating that only a rather small part of the variation in MR between individuals was explained by the differences in *M* (Table 3). An estimation of the scaling exponent *b* with OLS indicated that *b* was not statistically distinguishable from 1 in all species (isometry), but the SMA estimation indicated that MR scales hypermetrically with *M* (*b*>1) (Table 3, Fig. 4). The *b* values calculated by SMA were significantly different from either 0.67 or 0.75, but among the OLS estimations, only the *b* of *C. nemoralis* was significantly different from 0.75. The low *r*^{2} resulted in a large difference in the values of the exponents estimated with OLS and SMA for *A. ovalis*, *C. linnei* and *P. burmeisteri*. A very high SMA exponent, at least for *C. nemoralis* and *C. violaceus*, suggests a non-realistically steep within-species relationship between MR and *M* (Table 3, Fig. 4).

The sample size allowed us to compare the MR–*M* relationship in individuals with different modes of gas exchange only in two species (Table 4, Fig. 5). Overall, the scaling exponent *b* estimates either did not differ from 1 (isometry; OLS) or were greater than 1 (positive allometry; SMA). In *C. nemoralis* (45 beetles with DGE and 32 beetles with cyclic), estimates of the scaling exponent *b* differed between the gas exchange patterns in the OLS model (*F*_{1,1}=5.1, *P*<0.05) but not in the SMA model (likelihood ratio test, *P*=0.28). In the latter case, we found a significant difference in intercept between gas exchange patterns (Wald test, *P*<0.001); beetles with DGE had lower MRs. In *C. arvensis* (30 beetles with DGE and 37 beetles with cyclic), the mass-scaling exponents did not differ between the gas exchange patterns (OLS: *F*_{1,1}=0.06, *P*=0.81; SMA: likelihood ratio test, *P*=0.99), but the intercepts differed (OLS: *F*_{1,1}=31.7, *P*<0.001; SMA: Wald test; *P*<0.001); beetles with DGE had slower MRs.

## DISCUSSION

Fitting a line is a complex problem, but the predictive capacity of a variable of interest based solely on body mass is one of its greatest strengths. Our intra- and interspecific analyses of the relationships between MR and *M* in ground beetles – one of the 10 largest animal families – revealed heterogeneity among the mass-scaling exponents, which challenges the idea of universal allometric scaling of MR. Generally, our analyses of the interspecific relationship between MR and *M* resulted in various estimations of the scaling exponent *b*, revealing negative allometry (*b*<1) for OLS, PGLS and SMA or even isometry (*b*=1) for SMA. At the intraspecific level, the mass scaling of MR even exhibited positive allometry (*b*>1) for SMA, at least in some species, but the relationships between MR and *M* were much weaker on the intraspecific level. The smaller body-size ranges observed within species may result in more variable estimates of scaling exponents (Glazier, 2005). The applicability of a universal scaling exponent has been questioned based on interspecific studies of fish (Clarke and Johnston, 1999), amphibians (Hillman and Withers, 1979), reptiles (Chappell and Ellis, 1987; Starostová et al., 2009) and mammals (White and Seymour, 2005a) as well as intraspecific studies of the land snail *Cornu aspersum* (Czarnoleski et al., 2008; Gaitan-Espitia et al., 2013). Consistent with the results of previous studies, our research also clearly shows that metabolic scaling depends on the taxonomic level at which it is established (Chown et al., 2007; Glazier, 2005; Kozlowski and Konarzewski, 2005; White et al., 2007b, 2006); we found that the value of *b* more frequently diverged from the values of 0.67 and 0.75 at an intraspecific than an interspecific level. This pattern is in line with a meta-analysis conducted on 1242 animal species (64 orders) in which 50% of orders showed scaling exponents outside of the 0.68–0.82 range, and 5% were outside of the 0.54–0.95 range (Isaac and Carbone, 2010). Overall, our results and emerging evidence from other studies suggest that the heterogeneity of scaling exponents is a fact, and deviations in scaling exponents from 0.67 or 0.75 must be treated as a rule rather than an exception.

Numerous factors affecting scaling exponent and intercept estimation have been recognized based on real biological datasets (Capellini et al., 2010; Ehnes et al., 2011; Terblanche et al., 2008b; White and Kearney, 2014; White and Seymour, 2005b) or randomly generated datasets (Hui et al., 2010; Kilmer and Rodriguez, 2017; Warton et al., 2006). Here, we provided a coherent dataset that permitted the testing of factors discussed frequently in literature, e.g. an appropriate best-fit line model, as well as those that have been neglected despite their great importance, e.g. the objective recognition of respiration patterns and topology of a phylogenetic tree. Often in studies of MR mass scaling, and especially in meta-analyses addressing this phenomenon, all species are treated equally even if they cluster in different genera, orders or families (Dodds et al., 2001; Heusner, 1991; Savage et al., 2004). The determination of a single model for heterogeneous data can lead to misleading results and biased scaling exponent estimations, which was recognized decades ago by Heusner (1982) in the effects of a grade shift. The grade shift between two clades of carabid beetles in our data indicate the need to fit two separate regressions that are characterized by a common scaling exponent but different intercepts. Because the clade with heavier-than-average species has a higher elevation, we found a steeper scaling exponent when all species were treated as one group (OLS, *b*=0.71; Table 1) than when the two clades were considered separately (common scaling exponent *b*=0.61; Fig. 2). Consistent with the bias caused by grade shifts, estimations of the mass-scaling exponent calculated separately for each clade resembled an exponent value calculated from all data if we used phylogenetically informed modeling (PGLS), but the information about the different elevations was lost. The unwanted effects of a grade shift were demonstrated by Martin et al. (2005), who analyzed the scaling of the gestation period in 429 placental mammal species. Overall, the scaling exponent for a single distribution was higher than that of two separately fitted lines for altricial neonates and precocial neonates. Furthermore, within those two major groups, further grade distinctions between taxonomic categories (lipotyphylan insectivores and carnivores, myomorph rodents, primates, artiodactyls and hystricomorph rodents) were detected.

Our results also suggest that conclusions about the mass scaling of metabolic rates can depend on organismal characteristics that systematically change with body mass and that are either inherent to study organisms or induced by the environment. When we analyzed data for *C. arvensis* at an intraspecific level, we observed a very poor correlation between MR and *M*, but this relationship was significantly improved after the analyzed dataset was split according to the type of gas exchange into individuals with either DGE or cyclic patterns (Table 4, Fig. 4). In addition, we found that the frequency of DGE increased with *M*, demonstrating that the type of gas exchange may influence the shape of the relationship between MR and *M* via its link to MR. If gas exchange patterns are heritable (Schimpf et al., 2013), there may also be a heritable component in the intraspecific scaling of metabolic rates. At least six insect orders – Blattodea, Orthoptera, Coleoptera, Lepidoptera, Hymenoptera and Phasmatodea – exhibit all three gas exchange patterns (Marais et al., 2005; Thienel et al., 2015). It is not known whether gas exchange patterns are the cause of differences in *V̇*_{CO2} production in some species (Gibbs and Johnson, 2004; Gudowska et al., 2016; Huang et al., 2015), but it is known that gas exchange patterns are related to environmental conditions (Chown et al., 2006; White et al., 2007a). The question of whether environmental conditions differentially affect organisms that differ in *M* has been investigated in different insect species (reviewed in Addo-Bediako et al., 2002), but future studies of allometric scaling in arthropods should also account for different gas exchange patterns to explain the nature of the variation around the central tendencies in scaling exponents. Accordingly, a meta-analysis conducted by Terblanche et al. (2008b) on 80 insect species demonstrated that the best model of the cycle frequency–mass scaling relationship included both phylogenetic information and gas exchange type.

In addition to the dependence of metabolic mass scaling on phylogenetic structure, our study highlights the importance of the line-fitting method used to study the relationship between *X* (log_{10}*M*) and *Y* (log_{10}MR). Any line fit to the data requires a specific methodological approach and creates trade-offs. All will have biased parameter estimates owing to measurement error and/or equation error. Different criteria of bias measures and research objectives will lead to different recommendations (Hui et al., 2010; Kilmer and Rodriguez, 2017; McArdle, 2003). Hui et al. (2010) proposed SMA for predicting *Y*-value with the same frequency distribution as observations, paying attention to the importance of residual shape, sample size and coefficient of determination. By contrast, Kilmer and Rodriguez (2017) argued against using SMA to compensate for the measurement error in *X* and provided a valuable remedy in the form of an attenuation factor, λ, applied to a simple OLS regression. The factor is calculated as var(*X*)/[var(*X*)+var(ε* _{X}*)], where ε

*is the measurement error of*

_{X}*X*. The ultimate estimation of the OLS slope, accounting for the measurement error, is thus

*b*

_{OLS}=

*b*′

_{OLS}/λ, where

*b*′

_{OLS}is an attenuated slope calculated by standard methods. The authors provided six examples from the literature that allowed ε

*to be calculated, all of which concerned morphological traits whose measurement error only minorly affected the slope estimation. In interspecific studies of the mass scaling of MR, equivalents of the main source of the ‘measurement error’ are individual differences in the*

_{X}*M*values within a species (e.g. Kilmer and Rodriguez, 2017). We estimated the attenuation factor λ using the variance in mean species values of log

_{10}

*M*as var(

*X*), and the mean within-species variance of log

_{10}

*M*as var(ε

*). Calculated this way, λ equals 0.97, which means that the OLS slopes reported for our two clades, 0.61, are attenuated by 0.019, so the attenuation of OLS slopes does not change conclusions about interspecific scaling. Such small biases can be a problem for advocates of universal slopes of 0.67 or 0.75, but our results support the notion of variable slopes.*

_{X}In conclusion, data collection and subsequent analysis of scaling exponents require caution. A few potentially interacting problems, such as the choice of an appropriate best-fit line model, the objective recognition of grade shifts, and the respiration pattern in the case of arthropods, now appear to be prerequisites for calculating reasonable estimates. Because these factors are often or even usually neglected, many studies and their interpretations may be influenced by the risk of unreliable statistical artefacts (Heusner, 1982; Martin et al., 2005). Avoiding such statistical artefacts and the incorporation of factors that demonstrably affect MR allometric scaling now appear to be important to the art of disentangling the almost dogmatic discussions about the existence or non-existence of a uniform scaling exponent. Our comprehensive analysis that accounts for phylogeny, grade shifts and the specificities of animal respiration strongly promotes the existence of different scaling exponents.

## Acknowledgements

We thank K. Kapela, L. Kuriańska, A. Sikorska, A. Filiczkowska and P. Gumienny for their help with maintaining the beetles.

## FOOTNOTES

**Competing interests**The authors declare no competing or financial interests.

**Author contributions**Conceptualization: A.G., M.C., U.B., J.K.; Methodology: A.G.; Formal analysis: A.G., A.A.; Investigation: A.G., B.W.S., U.B.; Writing - original draft: A.G.; Writing - review & editing: A.G., B.W.S., M.C., A.A., U.B., J.K.; Visualization: A.G.; Supervision: J.K.; Funding acquisition: A.G., J.K.

**Funding**This research was supported by a National Science Centre Poland (Narodowe Centrum Nauki) MAESTRO grant to J.K. (grant no. 2011/02/A/NZ8/00064) and an ETIUDA grant to A.G. (grant no. 2016/20/T/NZ8/00530), and a Jagiellonian University in Krakow (Uniwersytet Jagielloński w Krakowie) DS grant to J.K. (DS/WBINOZ/INOŚ/757/2016).

**Data availability**Data are available at Figshare: https://doi.org/10.6084/m9.figshare.4775224.v1

- Received March 10, 2017.
- Accepted July 11, 2017.

- © 2017. Published by The Company of Biologists Ltd