summaryrefslogtreecommitdiffstats
path: root/net/dccp/ccids
Commit message (Collapse)AuthorAgeFilesLines
* [DCCP]: Fix bug in the calculation of very low sending ratesGerrit Renker2007-04-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | This fixes an error in the calculation of t_ipi when X converges towards very low sending rates (between 1 and 64 bytes per second). Although this case may not sound likely, it can be reproduced by connecting, hitting enter (1 byte sent) and waiting for some time, during which the nofeedback timer halves the sending rate until finally it reaches the region 1..64 bytes/sec. Computing X is handled correctly (tested separately); but by dividing X _before_ entering the calculation of t_ipi, X becomes zero as a result. This in turn triggers a BUG condition caught in scaled_div(). Fixed by replacing with equivalent statement and explicit typecast for good measure. Calculation verified and effect of patch tested - reduced never below 1 byte per 64 seconds afterwards, i.e. not allowing divide-by-zero. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Use initial RTT sample from SYN exchangeGerrit Renker2007-04-261-10/+21
| | | | | | | | | | | | | | | The patch follows the following recommendation made in an erratum to RFC 4342: "Senders MAY additionally make use of other available RTT measurements, including those from the initial Request-Response packet exchange." It implements larger initial windows with regard to this inital RTT measurement, using the mechanism suggested in draft-ietf-dccp-rfc3448bis, section 4.2. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Use function for RTT samplingGerrit Renker2007-04-262-41/+11Star
| | | | | | | | | | This replaces the existing occurrences of RTT sampling with the use of the new function dccp_sample_rtt. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Handle Idle and Application-Limited periodsGerrit Renker2007-04-261-44/+40Star
| | | | | | | This updates the code with regard to handling idle and application-limited periods as specified in [RFC 4342, 5.1]. Background:
* [CCID3]: Wrap computation of RFC3390-initial rate into separate functionGerrit Renker2007-04-261-7/+18
| | | | | | | | | | | | | | | The CCID 3 and TFRC specs (RFC 4342, RFC 3448, draft-3448bis) make frequent reference to the computation of the RFC-3390 initial sending rate: 1. Initial sending rate when RTT is known (RFC 4342, p. 6) 2. Response to Idle/Application-Limited periods (RFC 4342, 5.1) This warrants putting the code into its own function, for later code reuse. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Remove build warnings for 64bitGerrit Renker2007-04-261-5/+7
| | | | | | | | | | | | | This clears the following sparc64 build warnings: 1) warning: format "%ld" expects type "long int", but argument 3 has type "suseconds_t" 2) warning: format "%llu" expects type "long long unsigned int", but argument 3 has type "__u64" Fixed by using typecast to unsigned. This is argued to be safe, since the quantities, after de-scaling (factor 2^6) fit all in u32. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Remove race condition and update t_ipi when `s' changesGerrit Renker2007-04-261-23/+12Star
| | | | | | | | | | | | | | | This: 1. removes a race condition in the access to the scheduled send time t_nom which results from allowing asynchronous r/w access to t_nom without locks; 2. updates the inter-packet interval t_ipi = s/X when `s' changes, following a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: More verbose debuggingIan McDonald2007-04-261-1/+12
| | | | | | | | | This adds a few debugging statements to ccid3.c Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Fix use of invalid loss intervalsIan McDonald2007-04-261-1/+1
| | | | | | | | | | This fixes a bug which uses an invalid comparison. The bug resulted in the use of invalid loss intervals. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Use MSS for larger initial windowsGerrit Renker2007-04-261-5/+5
| | | | | | | | | | | | | | | | | | | | This improves the slow-start phase by using the MSS (as suggested in RFC 4342, sec. 5) instead of the packet size s. Also figured out that __u32 is ample resource enough. After applying, I got the following in the logs: ccid3_hc_tx_packet_recv: client(f7421700), s=6, MSS=1424, w_init=4380, R_sample=176us, X=24886363 Had the previous variant been used, w_init would have been as low as 24. Committer note: removed unneeded cast to unsigned long long that was causing a compiler warning on 64bit architectures. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Re-order CCID 3 source fileGerrit Renker2007-04-261-44/+45
| | | | | | | | | | | No code change at all. This splits ccid3.c into a RX and a TX section, so that the file has an organisation similar to the other ones (e.g. packet_history.{h,c}). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [CCID3]: Remove redundant `len' testGerrit Renker2007-04-261-5/+2Star
| | | | | | | | | | | Since CCID3 avoids sending 0-byte data packets (cf. ccid3_hc_tx_send_packet), testing for zero-payload length, as performed by ccid3_hc_tx_update_s, is redundant - hence removed by this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP]: Revert patch which disables bidirectional modeGerrit Renker2007-03-081-6/+1Star
| | | | | | | | | | | | | | | | This reverts an earlier patch which disabled bidirectional mode, meaning that a listening (passive) socket was not allowed to write to the other (active) end of the connection. This mode had been disabled when there were problems with CCID3, but it imposes a constraint on socket programming and thus hinders deployment. A change is included to ignore RX feedback received by the TX CCID3 module. Many thanks to Andre Noll for pointing out this issue. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] DCCP: Fix whitespace errors.YOSHIFUJI Hideaki2007-02-114-52/+52
| | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP]: Warning fixes.Andrew Morton2007-02-081-2/+3
| | | | | | | | | | | net/dccp/ccids/ccid3.c: In function `ccid3_hc_rx_packet_recv': net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 3) net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 4) opaque types must be suitably cast for printing. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ccid3: return value in ccid3_hc_rx_calc_first_liIan McDonald2006-12-141-3/+3
| | | | | | | | | | | | In a recent patch we introduced invalid return codes which will result in the opposite of what is intended (i.e. send more packets in face of peculiar network conditions). This fixes it by returning ~0 which means not calculated as per dccp_li_hist_calc_i_mean. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Whitespace cleanupsArnaldo Carvalho de Melo2006-12-113-66/+79
| | | | | | | That accumulated over the last months hackaton, shame on me for not using git-apply whitespace helping hand, will do that from now on. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Fixup some type conversions related to rttsArnaldo Carvalho de Melo2006-12-112-12/+12
| | | | | | | | | | | | Spotted by David Miller when compiling on sparc64, I reproduced it here on parisc64, that are the only platforms to define __kernel_suseconds_t as an 'int', all the others, x86_64 and x86 included typedef it as a 'long', but from the definition of suseconds_t it should just be an 'int' on platforms where it is >= 32bits, it would not require all the castings from suseconds_t to (int) when printking variables of this type, that are not needed on parisc64 and sparc64. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: BUG-FIX - conversion errorsGerrit Renker2006-12-111-24/+30
| | | | | | | | | | | | | This fixes conversion errors which arose by not properly type-casting from u32 to __u64. Fixed by explicitly casting each type which is not __u64, or by performing operation after assignment. The patch further adds missing debug information to track the current value of X_recv. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Reorder packet history source fileGerrit Renker2006-12-111-107/+112
| | | | | | | | | | | No code change at all. This reorders the source file to follow the same order as the corresponding header file. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Reorder packet history header fileGerrit Renker2006-12-111-60/+67
| | | | | | | | | | | | | | | | | | | No code change at all. To make the header file easier to read, the following ordering is established among the declarations: * hist_new * hist_delete * hist_entry_new * hist_head * hist_find_entry * hist_add_entry * hist_entry_delete * hist_purge Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Make debug output consistentGerrit Renker2006-12-111-35/+32Star
| | | | | | | | | | | | | | This patch does not alter any algorithm, just the debug message format: * s#%s, sk=%p#%s(%p)#g * when a statename is present, it now uses %s(%p, state=%s) * when only function entry is debugged, it adds an `- entry' Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Perform history operations only after packet has been sentGerrit Renker2006-12-111-29/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This migrates all packet history operations into the routine ccid3_hc_tx_packet_sent, thereby removing synchronization problems that occur when, as before, the operations are spread over multiple routines. The following minor simplifications are also applied: * several simplifications now follow from this change - several tests are now no longer required * removal of one unnecessary variable (dp) Justification: Currently packet history operations span two different routines, one of which is likely to pass through several iterations of sleeping and awakening. The first routine, ccid3_hc_tx_send_packet, allocates an entry and sets a few fields. The remaining fields are filled in when the second routine (which is not within a sleeping context), ccid3_hc_tx_packet_sent, is called. This has several strong drawbacks: * it is not necessary to split history operations - all fields can be filled in by the second routine * the first routine is called multiple times, until a packet can be sent, and sleeps meanwhile - this causes a lot of difficulties with regard to keeping the list consistent * since both routines do not have a producer-consumer like synchronization, it is very difficult to maintain data across calls to these routines * the fact that the routines are called in different contexts (sleeping, not sleeping) adds further problems Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: TX history - remove unused fieldGerrit Renker2006-12-112-3/+3
| | | | | | | | | This removes the `dccphtx_ccval' field since it is nowhere used in the code and in fact not necessary for the accounting. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Shift window counter computationGerrit Renker2006-12-111-20/+29
| | | | | | | | | | | | | | | | | | | | | | | This puts the window counter computation [RFC 4342, 8.1] into a separate function which is called whenever a new packet is ready for immediate transmission in ccid3_hc_tx_send_packet. Justification: The window counter update was previously computed after the packet was sent. This has two drawbacks, both fixed by this patch: 1) re-compute another timestamp almost directly after the packet was sent (expensive), 2) the CCVal for the window counter is needed at the instant the packet is sent. Further details: The initialisation of the window counter is left in the state NO_SENT, as before. The algorithm will do nothing if either RTT is initialised to 0 (which is ok) or if the RTT value remains below 4 microseconds (which is almost pathological). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Sanity-check RTT samplesGerrit Renker2006-12-112-1/+13
| | | | | | | | | | | | | | | | | | | CCID3 performance depends much on the accuracy of RTT samples. If RTT samples grow too large, performance can be catastrophically poor. To limit the amount of possible damage in such cases, the patch * introduces an upper limit which identifies a maximum `sane' RTT value; * uses a macro to enforce this upper limit. Using a macro was given preference, since it is necessary to identify the calling function in the warning message. Since exceeding this threshold identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN. Many thanks to Ian McDonald for collaboration on this issue. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Initialise RTT valuesGerrit Renker2006-12-111-1/+2
| | | | | | | | | | | | | | | | | | In both the sender and the receiver it is possible that the stored RTT value is accessed before an actual RTT estimate has been computed. This patch * initialises the sender RTT to 0 - the sender always accesses the RTT in ccid3_hc_tx_packet_sent - the RTT is further needed for the window counter algorithm * replaces the receiver initialisation of 5msec with 0 - which has the same effect and removes an `XXX' - the RTT value is needed in ccid3_hc_rx_packet_recv as rtt_prev Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid: Deprecate ccid_hc_tx_insert_optionsGerrit Renker2006-12-111-12/+0Star
| | | | | | | | | | | | | | | | | The function ccid3_hc_tx_insert_options only does a redundant no-op, as the operation DCCP_SKB_CB(skb)->dccpd_ccval = hctx->ccid3hctx_last_win_count; is already performed _unconditionally_ in ccid3_hc_tx_send_packet. Since there is further no current need for this function, it is removed entirely. Since furthermore, there is actually no present need for the entire interface function ccid_hc_tx_insert_options, it was decided to remove it also, to clean up the interface. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Only deliver to the CCID rx side in chargeGerrit Renker2006-12-111-3/+3
| | | | | | | | | | | | | | | | | | | | | | | This is an optimisation to reduce CPU load. The received feedback is now only directed to the active CCID component, without requiring processing also by the inactive one. As a consequence, a similar test in ccid3.c is now redundant and is also removed. Justification: Currently DCCP works as a unidirectional service, i.e. a listening server is not at the same time a connecting client. As far as I can see, several modifications are necessary until that becomes possible. At the present time, received feedback is both fed to the rx/tx CCID modules. In unidirectional service, only one of these is active at any one time. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Simplify TFRC calculationGerrit Renker2006-12-113-35/+35
| | | | | | | | | | | | | | | | | | | | | In migrating towards using the newer functions scaled_div/scaled_div32 for TFRC computations mapped from floating-point onto integer arithmetic, this completes the last stage of modifications. In particular, the overflow case for computing X_calc is circumvented by * breaking the computation into two stages * the first stage, res = (s*1E6)/R, cannot overflow due to use of u64 * in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due to (i) returning UINT_MAX in this case (which is logically appropriate) and (ii) issuing a warning message into the system log (since very likely there is a problem somewhere else with the parameters) Lastly, all such scaling operations are now exported into tfrc.h, since actually this form of scaled computation is specific to TFRC and not to CCID3. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Debug timeval operationsGerrit Renker2006-12-111-24/+32
| | | | | | | | | | | | | | | | | | | | Problem: Most target types in the CCID3 code are u32, so subtle conversion errors can occur if signed time calculations yield negative results: the original values are lost in the conversion to unsigned, calculation errors go undetected. This patch therefore * sets all critical time types from unsigned to suseconds_t * avoids comparison between signed/unsigned via type-casting * provides ample warning messages in case time calculations are negative These warning messages can be removed at a later stage when the code has undergone more testing. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Simplify calculation for reverse lookup of pGerrit Renker2006-12-111-17/+22
| | | | | | | | | | | | | | | | | | | | | | | This simplifies the calculation of a value p for a given fval when the first loss interval is computed (RFC 3448, 6.3.1). It makes use of the two new functions scaled_div/scaled_div32 to provide overflow protection. Additionally, protection against divide-by-zero is extended - in this case the function will return the maximally possible value of p=100%. Background: The maximum fval, f(100%), is approximately 244, i.e. the scaled value of fval should never exceed 244E6, which fits easily into u32. The problem is the scaling by 10^6, since additionally R(TT) is in microseconds. This is resolved by breaking the division into two stages: the first stage computes fval=(s*10^6)/R, stores that into u64; the second stage computes fval = (fval*10^6)/X_recv and complains if overflow is reached for u32. This case is safe since the TFRC reverse-lookup routine then returns p=100%. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Replace scaled division operationsGerrit Renker2006-12-111-24/+3Star
| | | | | | | | | This replaces the remaining uses of usecs_div with scaled_div32, which internally uses 64bit division and produces a warning on overflow. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Finer-grained resolution of sending ratesGerrit Renker2006-12-112-37/+72
| | | | | | | | | | | | | | | | | | | | | | | This patch * resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0 * supports all sending rates from 1/64 bytes/second up to 4Gbyte/second * simplifies the present overflow problems in calculations Current sending rate X and the cached value X_recv of the receiver-estimated sending rate are both scaled by 64 (2^6) in order to * cope with low sending rates (minimally 1 byte/second) * allow upgrading to use a packets-per-second implementation of CCID 3 * avoid calculation errors due to integer arithmetic cut-off The patch implements a revised strategy from http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html The only difference with regard to that strategy is that t_ipi is already used in the calculation of the nofeedback timeout, which saves one division. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Fix two bugs in sending rate computationGerrit Renker2006-12-111-2/+2
| | | | | | | | | | | | | | This fixes 1) a bug in the recomputation of the sending rate by the nofeedback timer when no feedback at all has so far been sent by the receiver: min_t was used instead of max_t, which is wrong (cf. RFC 3448, p. 10); 2) an error in the computation of larger initial windows: instead of min(... max()) (cf. RFC 4342, 5.), the code had used max(... max()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Two optimisations for sending rate recomputationGerrit Renker2006-12-111-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This performs two optimisations for the recomputation of the sending rate. 1) Currently the target sending rate X_calc is recalculated whenever a) the nofeedback timer expires, or b) a feedback packet is received. In the (a) case, recomputing X_calc is redundant, since * the parameters p and RTT do not change in between the reception of feedback packets; * the parameter X_recv is either modified from received feedback or via the nofeedback timer; * a test (`p == 0') in the nofeedback timer avoids using a stale/undefined value of X_calc if p was previously 0. 2) The nofeedback timer now only recomputes a timestamp when p == 0. This is according to step (4) of [RFC 3448, 4.3] and avoids unnecessarily determining a timestamp. A debug statement about not updating X is also removed - it helps very little in debugging and just clutters the logs. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Check against too large pGerrit Renker2006-12-111-4/+9
| | | | | | | | | | | | This patch follows a suggestion by Ian McDonald and ensures that in the current code the value of p can not exceed 100%. Such a value is illegal and would consequently cause a bug condition in tfrc_calc_x(). The receiver case is also tested, and a warning message is added. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [PATCH] slab: remove kmem_cache_tChristoph Lameter2006-12-072-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] slab: remove SLAB_ATOMICChristoph Lameter2006-12-072-4/+4
| | | | | | | | SLAB_ATOMIC is an alias of GFP_ATOMIC Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [DCCP] tfrc: Binary search for reverse TFRC lookupGerrit Renker2006-12-031-14/+24
| | | | | | | | | | | | | This replaces the linear search algorithm for reverse lookup with binary search. It has the advantage of better scalability: O(log2(N)) instead of O(N). This means that the average number of iterations is reduced from 250 (linear search if each value appears equally likely) down to at most 9. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Deprecate TFRC_SMALLEST_PGerrit Renker2006-12-032-18/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch deprecates the existing use of an arbitrary value TFRC_SMALLEST_P for low-threshold values of p. This avoids masking low-resolution errors. Instead, the code now checks against real boundaries (implemented by preceding patch) and provides warnings whenever a real value falls below the threshold. If such messages are observed, it is a better solution to take this as an indication that the lookup table needs to be re-engineered. Changelog: ---------- This patch * makes handling all TFRC resolution errors local to the TFRC library * removes unnecessary test whether X_calc is 'infinity' due to p==0 -- this condition is already caught by tfrc_calc_x() * removes setting ccid3hctx_p = TFRC_SMALLEST_P in ccid3_hc_tx_packet_recv since this is now done by the TFRC library * updates BUG_ON test in ccid3_hc_tx_no_feedback_timer to take into account that p now is either 0 (and then X_calc is irrelevant), or it is > 0; since the handling of TFRC_SMALLEST_P is now taken care of in the tfrc library Justification: -------------- The TFRC code uses a lookup table which has a bounded resolution. The lowest possible value of the loss event rate `p' which can be resolved is currently 0.0001. Substituting this lower threshold for p when p is less than 0.0001 results in a huge, exponentially-growing error. The error can be computed by the following formula: (f(0.0001) - f(p))/f(p) * 100 for p < 0.0001 Currently the solution is to use an (arbitrary) value TFRC_SMALLEST_P = 40 * 1E-6 = 0.00004 and to consider all values below this value as `virtually zero'. Due to the exponentially growing resolution error, this is not a good idea, since it hides the fact that the table can not resolve practically occurring cases. Already at p == TFRC_SMALLEST_P, the error is as high as 58.19%! Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] tfrc: Identify TFRC table limits and simplify codeGerrit Renker2006-12-031-9/+18
| | | | | | | | | | | | | | | | This * adds documentation about the lowest resolution that is possible within the bounds of the current lookup table * defines a constant TFRC_SMALLEST_P which defines this resolution * issues a warning if a given value of p is below resolution * combines two previously adjacent if-blocks of nearly identical structure into one This patch does not change the algorithm as such. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] tfrc: Add protection against invalid parameters to TFRC routinesGerrit Renker2006-12-031-14/+19
| | | | | | | | | | | | | | | | | | | | | 1) For the forward X_calc lookup, it * protects effectively against RTT=0 (this case is possible), by returning the maximal lookup value instead of just setting it to 1 * reformulates the array-bounds exceeded condition: this only happens if p is greater than 1E6 (due to the scaling) * the case of negative indices can now with certainty be excluded, since documentation shows that the formulas are within bounds * additional protection against p = 0 (would give divide-by-zero) 2) For the reverse lookup, it warns against * protects against exceeding array bounds * now returns 0 if f(p) = 0, due to function definition * warns about minimal resolution error and returns the smallest table value instead of p=0 [this would mask congestion conditions] Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] tfrc: Fix small error in reverse lookup of p for given f(p)Gerrit Renker2006-12-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following small error in tfrc_calc_x_reverse_lookup. 1) The table is generated by the following equations: lookup[index][0] = g((index+1) * 1000000/TFRC_CALC_X_ARRSIZE); lookup[index][1] = g((index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE); where g(q) is 1E6 * f(q/1E6) 2) The reverse lookup assigns an entry in lookup[index][small] 3) This index needs to match the above, i.e. * if small=0 then p = (index + 1) * 1000000/TFRC_CALC_X_ARRSIZE * if small=1 then p = (index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE These are exactly the changes that the patch makes; previously the code did not conform to the way the lookup table was generated (this difference resulted in a mean error of about 1.12%). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] tfrc: Document boundaries and limits of the TFRC lookup tableGerrit Renker2006-12-031-56/+88
| | | | | | | | | | | | | | | This adds documentation for the TCP Reno throughput equation which is at the heart of the TFRC sending rate / loss rate calculations. It spells out precisely how the values were determined and what they mean. The equations were derived through reverse engineering and found to be fully accurate (verified using test programs). This patch does not change any code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Fix warning message about illegal ACKGerrit Renker2006-12-031-1/+2
| | | | | | | | | | | | | | | | | | | | This avoids a (harmless) warning message being printed at the DCCP server (the receiver of a DCCP half connection). Incoming packets are both directed to * ccid_hc_rx_packet_recv() for the server half * ccid_hc_tx_packet_recv() for the client half The message gets printed since on a server the client half is currently not sending data packets. This is resolved for the moment by checking the DCCP-role first. In future times (bidirectional DCCP connections), this test may have to be more sophisticated. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ccid3: Fix bug in calculation of send rateGerrit Renker2006-12-031-37/+52
| | | | | | | | | | | | | | | | | | | The main object of this patch is the following bug: ==> In ccid3_hc_tx_packet_recv, the parameters p and X_recv were updated _after_ the send rate was calculated. This is clearly an error and is resolved by re-ordering statements. In addition, * r_sample is converted from u32 to long to check whether the time difference was negative (it would otherwise be converted to a large u32 value) * protection against RTT=0 (this is possible) is provided in a further patch * t_elapsed is also converted to long, to match the type of r_sample * adds a a more debugging information regarding current send rates * various trivial comment/documentation updates Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Fix BUG in retransmission delay calculationGerrit Renker2006-12-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This bug resulted in ccid3_hc_tx_send_packet returning negative delay values, which in turn triggered silently dequeueing packets in dccp_write_xmit. As a result, only a few out of the submitted packets made it at all onto the network. Occasionally, when dccp_wait_for_ccid was involved, this also triggered a bug warning since ccid3_hc_tx_send_packet returned a negative value (which in reality was a negative delay value). The cause for this bug lies in the comparison if (delay >= hctx->ccid3hctx_delta) return delay / 1000L; The type of `delay' is `long', that of ccid3hctx_delta is `u32'. When comparing negative long values against u32 values, the test returned `true' whenever delay was smaller than 0 (meaning the packet was overdue to send). The fix is by casting, subtracting, and then testing the difference with regard to 0. This has been tested and shown to work. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Use higher RTO default for CCID3Gerrit Renker2006-12-032-8/+46
| | | | | | | | | | | | | | | | | The TFRC nofeedback timer normally expires after the maximum of 4 RTTs and twice the current send interval (RFC 3448, 4.3). On LANs with a small RTT this can mean a high processing load and reduced performance, since then the nofeedback timer is triggered very frequently. This patch provides a configuration option to set the bound for the nofeedback timer, using as default 100 milliseconds. By setting the configuration option to 0, strict RFC 3448 behaviour can be enforced for the nofeedback timer. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Use `unsigned' for packet lengthsGerrit Renker2006-12-032-48/+40Star
| | | | | | | | | | | | | | | | | | | | | | This patch implements a suggestion by Ian McDonald and 1) Avoids tests against negative packet lengths by using unsigned int for packet payload lengths in the CCID send_packet()/packet_sent() routines 2) As a consequence, it removes an now unnecessary test with regard to `len > 0' in ccid3_hc_tx_packet_sent: that condition is always true, since * negative packet lengths are avoided * ccid3_hc_tx_send_packet flags an error whenever the payload length is 0. As a consequence, ccid3_hc_tx_packet_sent is never called as all errors returned by ccid_hc_tx_send_packet are caught in dccp_write_xmit 3) Removes the third argument of ccid_hc_tx_send_packet (the `len' parameter), since it is currently always set to skb->len. The code is updated with regard to this parameter change. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>