go-bgp

a collection of golang BGP tools to monitor, archive and serve
git clone git://git.2f30.org/go-bgp

rfc1771.txt (131903B)


      1 
      2 
      3 
      4 
      5 
      6 
      7 Network Working Group                                         Y. Rekhter
      8 Request for Comments: 1771        T.J. Watson Research Center, IBM Corp.
      9 Obsoletes: 1654                                                    T. Li
     10 Category: Standards Track                                  cisco Systems
     11                                                                  Editors
     12                                                               March 1995
     13 
     14 
     15                   A Border Gateway Protocol 4 (BGP-4)
     16 
     17 Status of this Memo
     18 
     19    This document specifies an Internet standards track protocol for the
     20    Internet community, and requests discussion and suggestions for
     21    improvements.  Please refer to the current edition of the "Internet
     22    Official Protocol Standards" (STD 1) for the standardization state
     23    and status of this protocol.  Distribution of this memo is unlimited.
     24 
     25 Abstract
     26 
     27    This document, together with its companion document, "Application of
     28    the Border Gateway Protocol in the Internet", defines an
     29    inter-autonomous system routing protocol for the Internet.
     30 
     31 1. Acknowledgements
     32 
     33    This document was originally published as RFC 1267 in October 1991,
     34    jointly authored by Kirk Lougheed (cisco Systems) and Yakov Rekhter
     35    (IBM).
     36 
     37    We would like to express our thanks to Guy Almes (ANS), Len Bosack
     38    (cisco Systems), and Jeffrey C. Honig (Cornell University) for their
     39    contributions to the earlier version of this document.
     40 
     41    We would like to explicitly thank Bob Braden (ISI) for the review of
     42    the earlier version of this document as well as his constructive and
     43    valuable comments.
     44 
     45    We would also like to thank Bob Hinden, Director for Routing of the
     46    Internet Engineering Steering Group, and the team of reviewers he
     47    assembled to review the previous version (BGP-2) of this document.
     48    This team, consisting of Deborah Estrin, Milo Medin, John Moy, Radia
     49    Perlman, Martha Steenstrup, Mike St. Johns, and Paul Tsuchiya, acted
     50    with a strong combination of toughness, professionalism, and
     51    courtesy.
     52 
     53 
     54 
     55 
     56 
     57 
     58 Rekhter & Li                                                    [Page 1]
     59 
     60 RFC 1771                         BGP-4                        March 1995
     61 
     62 
     63    This updated version of the document is the product of the IETF IDR
     64    Working Group with Yakov Rekhter and Tony Li as editors. Certain
     65    sections of the document borrowed heavily from IDRP [7], which is the
     66    OSI counterpart of BGP. For this credit should be given to the ANSI
     67    X3S3.3 group chaired by Lyman Chapin (BBN) and to Charles Kunzinger
     68    (IBM Corp.) who was the IDRP editor within that group.  We would also
     69    like to thank Mike Craren (Proteon, Inc.), Dimitry Haskin (Bay
     70    Networks, Inc.), John Krawczyk (Bay Networks, Inc.), and Paul Traina
     71    (cisco Systems) for their insightful comments.
     72 
     73    We would like to specially acknowledge numerous contributions by
     74    Dennis Ferguson (MCI).
     75 
     76    The work of Yakov Rekhter was supported in part by the National
     77    Science Foundation under Grant Number NCR-9219216.
     78 
     79 2.  Introduction
     80 
     81    The Border Gateway Protocol (BGP) is an inter-Autonomous System
     82    routing protocol.  It is built on experience gained with EGP as
     83    defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as
     84    described in RFC 1092 [2] and RFC 1093 [3].
     85 
     86    The primary function of a BGP speaking system is to exchange network
     87    reachability information with other BGP systems.  This network
     88    reachability information includes information on the list of
     89    Autonomous Systems (ASs) that reachability information traverses.
     90    This information is sufficient to construct a graph of AS
     91    connectivity from which routing loops may be pruned and some policy
     92    decisions at the AS level may be enforced.
     93 
     94    BGP-4 provides a new set of mechanisms for supporting classless
     95    interdomain routing.  These mechanisms include support for
     96    advertising an IP prefix and eliminate the concept of network
     97    "class" within BGP.  BGP-4 also introduces mechanisms which allow
     98    aggregation of routes, including aggregation of AS paths.  These
     99    changes provide support for the proposed supernetting scheme [8, 9].
    100 
    101    To characterize the set of policy decisions that can be enforced
    102    using BGP, one must focus on the rule that a BGP speaker advertise to
    103    its peers (other BGP speakers which it communicates with) in
    104    neighboring ASs only those routes that it itself uses.  This rule
    105    reflects the "hop-by-hop" routing paradigm generally used throughout
    106    the current Internet.  Note that some policies cannot be supported by
    107    the "hop-by-hop" routing paradigm and thus require techniques such as
    108    source routing to enforce.  For example, BGP does not enable one AS
    109    to send traffic to a neighboring AS intending that the traffic take a
    110    different route from that taken by traffic originating in the
    119    neighboring AS.  On the other hand, BGP can support any policy
    120    conforming to the "hop-by-hop" routing paradigm.  Since the current
    121    Internet uses only the "hop-by-hop" routing paradigm and since BGP
    122    can support any policy that conforms to that paradigm, BGP is highly
    123    applicable as an inter-AS routing protocol for the current Internet.
    124 
    125    A more complete discussion of what policies can and cannot be
    126    enforced with BGP is outside the scope of this document (but refer to
    127    the companion document discussing BGP usage [5]).
    128 
    129    BGP runs over a reliable transport protocol.  This eliminates the
    130    need to implement explicit update fragmentation, retransmission,
    131    acknowledgement, and sequencing.  Any authentication scheme used by
    132    the transport protocol may be used in addition to BGP's own
    133    authentication mechanisms.  The error notification mechanism used in
    134    BGP assumes that the transport protocol supports a "graceful" close,
    135    i.e., that all outstanding data will be delivered before the
    136    connection is closed.
    137 
    138    BGP uses TCP [4] as its transport protocol.  TCP meets BGP's
    139    transport requirements and is present in virtually all commercial
    140    routers and hosts.  In the following descriptions the phrase
    141    "transport protocol connection" can be understood to refer to a TCP
    142    connection.  BGP uses TCP port 179 for establishing its connections.
    143 
    144    This document uses the term `Autonomous System' (AS) throughout.  The
    145    classic definition of an Autonomous System is a set of routers under
    146    a single technical administration, using an interior gateway protocol
    147    and common metrics to route packets within the AS, and using an
    148    exterior gateway protocol to route packets to other ASs.  Since this
    149    classic definition was developed, it has become common for a single
    150    AS to use several interior gateway protocols and sometimes several
    151    sets of metrics within an AS.  The use of the term Autonomous System
    152    here stresses the fact that, even when multiple IGPs and metrics are
    153    used, the administration of an AS appears to other ASs to have a
    154    single coherent interior routing plan and presents a consistent
    155    picture of what destinations are reachable through it.
    156 
    157    The planned use of BGP in the Internet environment, including such
    158    issues as topology, the interaction between BGP and IGPs, and the
    159    enforcement of routing policy rules is presented in a companion
    160    document [5].  This document is the first of a series of documents
    161    planned to explore various aspects of BGP application.  Please send
    162    comments to the BGP mailing list (bgp@ans.net).
    163 
    175 3.  Summary of Operation
    176 
    177    Two systems form a transport protocol connection between one another.
    178    They exchange messages to open and confirm the connection parameters.
    179    The initial data flow is the entire BGP routing table.  Incremental
    180    updates are sent as the routing tables change.  BGP does not require
    181    periodic refresh of the entire BGP routing table.  Therefore, a BGP
    182    speaker must retain the current version of the entire BGP routing
    183    tables of all of its peers for the duration of the connection.
    184    KeepAlive messages are sent periodically to ensure the liveness of
    185    the connection.  Notification messages are sent in response to errors
    186    or special conditions.  If a connection encounters an error
    187    condition, a notification message is sent and the connection is
    188    closed.
    189 
    190    The hosts executing the Border Gateway Protocol need not be routers.
    191    A non-routing host could exchange routing information with routers
    192    via EGP or even an interior routing protocol.  That non-routing host
    193    could then use BGP to exchange routing information with a border
    194    router in another Autonomous System.  The implications and
    195    applications of this architecture are for further study.
    196 
    197    If a particular AS has multiple BGP speakers and is providing transit
    198    service for other ASs, then care must be taken to ensure a consistent
    199    view of routing within the AS.  A consistent view of the interior
    200    routes of the AS is provided by the interior routing protocol.  A
    201    consistent view of the routes exterior to the AS can be provided by
    202    having all BGP speakers within the AS maintain direct BGP connections
    203    with each other.  Using a common set of policies, the BGP speakers
    204    arrive at an agreement as to which border routers will serve as
    205    exit/entry points for particular destinations outside the AS.  This
    206    information is communicated to the AS's internal routers, possibly
    207    via the interior routing protocol.  Care must be taken to ensure that
    208    the interior routers have all been updated with transit information
    209    before the BGP speakers announce to other ASs that transit service is
    210    being provided.
    211 
    212    Connections between BGP speakers of different ASs are referred to as
    213    "external" links.  BGP connections between BGP speakers within the
    214    same AS are referred to as "internal" links.  Similarly, a peer in a
    215    different AS is referred to as an external peer, while a peer in the
    216    same AS may be described as an internal peer.
    217 
    231 3.1 Routes: Advertisement and Storage
    232 
    233    For purposes of this protocol a route is defined as a unit of
    234    information that pairs a destination with the attributes of a path to
    235    that destination:
    236 
    237       - Routes are advertised between a pair of BGP speakers in UPDATE
    238       messages:  the destinations are the systems whose IP addresses are
    239       reported in the Network Layer Reachability Information (NLRI)
    240       field, and the path is the information reported in the path
    241       attributes fields of the same UPDATE message.
    242 
    243       - Routes are stored in the Routing Information Bases (RIBs):
    244       namely, the Adj-RIBs-In, the Loc-RIB, and the Adj-RIBs-Out. Routes
    245       that will be advertised to other BGP speakers must be present in
    246       the Adj-RIBs-Out; routes that will be used by the local BGP speaker
    247       must be present in the Loc-RIB, and the next hop for each of these
    248       routes must be present in the local BGP speaker's forwarding
    249       information base; and routes that are received from other BGP
    250       speakers are present in the Adj-RIBs-In.
    251 
    252    If a BGP speaker chooses to advertise the route, it may add to or
    253    modify the path attributes of the route before advertising it to a
    254    peer.
    255 
    256    BGP provides mechanisms by which a BGP speaker can inform its peer
    257    that a previously advertised route is no longer available for use.
    258    There are three methods by which a given BGP speaker can indicate
    259    that a route has been withdrawn from service:
    260 
    261       a) the IP prefix that expresses destinations for a previously
    262       advertised route can be advertised in the WITHDRAWN ROUTES field
    263       in the UPDATE message, thus marking the associated route as being
    264       no longer available for use
    265 
    266       b) a replacement route with the same Network Layer Reachability
    267       Information can be advertised, or
    268 
    269       c) the BGP speaker - BGP speaker connection can be closed, which
    270       implicitly removes from service all routes which the pair of
    271       speakers had advertised to each other.
    272 
    287 3.2 Routing Information Bases
    288 
    289    The Routing Information Base (RIB) within a BGP speaker consists of
    290    three distinct parts:
    291 
    292       a) Adj-RIBs-In: The Adj-RIBs-In store routing information that has
    293       been learned from inbound UPDATE messages. Their contents
    294       represent routes that are available as an input to the Decision
    295       Process.
    296 
    297       b) Loc-RIB: The Loc-RIB contains the local routing information
    298       that the BGP speaker has selected by applying its local policies
    299       to the routing information contained in its Adj-RIBs-In.
    300 
    301       c) Adj-RIBs-Out: The Adj-RIBs-Out store the information that the
    302       local BGP speaker has selected for advertisement to its peers. The
    303       routing information stored in the Adj-RIBs-Out will be carried in
    304       the local BGP speaker's UPDATE messages and advertised to its
    305       peers.
    306 
    307    In summary, the Adj-RIBs-In contain unprocessed routing information
    308    that has been advertised to the local BGP speaker by its peers; the
    309    Loc-RIB contains the routes that have been selected by the local BGP
    310    speaker's Decision Process; and the Adj-RIBs-Out organize the routes
    311    for advertisement to specific peers by means of the local speaker's
    312    UPDATE messages.
    313 
    314    Although the conceptual model distinguishes between Adj-RIBs-In,
    315    Loc-RIB, and Adj-RIBs-Out, this neither implies nor requires that an
    316    implementation must maintain three separate copies of the routing
    317    information. The choice of implementation (for example, 3 copies of
    318    the information vs 1 copy with pointers) is not constrained by the
    319    protocol.
    320 
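The "1 copy with pointers" option mentioned above can be pictured with a
minimal Go sketch. All type, field, and method names here are hypothetical
and chosen only for illustration; the route fields are a simplification, and
the Select method merely stands in for the Decision Process.

```go
package main

import "fmt"

// Route pairs a destination with attributes of a path to it.
// The field set is a hypothetical simplification.
type Route struct {
	Prefix  string   // destination, e.g. "192.0.2.0/24"
	ASPath  []uint16 // ASs the reachability information has traversed
	NextHop string
}

// RIB holds the three conceptual parts. Every table stores pointers, so a
// route learned once can appear in several tables without being copied.
type RIB struct {
	AdjRIBsIn  map[string]map[string]*Route // peer -> prefix -> route learned from that peer
	LocRIB     map[string]*Route            // prefix -> route selected locally
	AdjRIBsOut map[string]map[string]*Route // peer -> prefix -> route selected for advertisement
}

// NewRIB returns an empty RIB with all tables initialized.
func NewRIB() *RIB {
	return &RIB{
		AdjRIBsIn:  make(map[string]map[string]*Route),
		LocRIB:     make(map[string]*Route),
		AdjRIBsOut: make(map[string]map[string]*Route),
	}
}

// Learn records a route received from peer in that peer's Adj-RIB-In.
func (r *RIB) Learn(peer string, rt *Route) {
	if r.AdjRIBsIn[peer] == nil {
		r.AdjRIBsIn[peer] = make(map[string]*Route)
	}
	r.AdjRIBsIn[peer][rt.Prefix] = rt
}

// Select installs a route in the Loc-RIB (a stand-in for the Decision Process).
func (r *RIB) Select(rt *Route) { r.LocRIB[rt.Prefix] = rt }

// Advertise places a route in the Adj-RIB-Out kept for peer.
func (r *RIB) Advertise(peer string, rt *Route) {
	if r.AdjRIBsOut[peer] == nil {
		r.AdjRIBsOut[peer] = make(map[string]*Route)
	}
	r.AdjRIBsOut[peer][rt.Prefix] = rt
}

func main() {
	rib := NewRIB()
	rt := &Route{Prefix: "192.0.2.0/24", ASPath: []uint16{64512}, NextHop: "198.51.100.1"}
	rib.Learn("peerA", rt)
	rib.Select(rt)
	rib.Advertise("peerB", rt)
	// All three tables reference the same stored route.
	fmt.Println(rib.LocRIB[rt.Prefix] == rib.AdjRIBsIn["peerA"][rt.Prefix])
}
```
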
    321 4.  Message Formats
    322 
    323    This section describes message formats used by BGP.
    324 
    325    Messages are sent over a reliable transport protocol connection.  A
    326    message is processed only after it is entirely received.  The maximum
    327    message size is 4096 octets.  All implementations are required to
    328    support this maximum message size.  The smallest message that may be
    329    sent consists of a BGP header without a data portion, or 19 octets.
    330 
    343 4.1 Message Header Format
    344 
    345    Each message has a fixed-size header.  There may or may not be a data
    346    portion following the header, depending on the message type.  The
    347    layout of these fields is shown below:
    348 
    349        0                   1                   2                   3
    350        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    351       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    352       |                                                               |
    353       +                                                               +
    354       |                                                               |
    355       +                                                               +
    356       |                           Marker                              |
    357       +                                                               +
    358       |                                                               |
    359       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    360       |          Length               |      Type     |
    361       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    362 
    363       Marker:
    364 
    365          This 16-octet field contains a value that the receiver of the
    366          message can predict.  If the Type of the message is OPEN, or if
    367          the OPEN message carries no Authentication Information (as an
    368          Optional Parameter), then the Marker must be all ones.
    369          Otherwise, the value of the Marker can be predicted by some
    370          computation specified as part of the authentication mechanism
    371          (which is specified as part of the Authentication Information)
    372          used.  The Marker can be used to detect loss of synchronization
    373          between a pair of BGP peers, and to authenticate incoming BGP
    374          messages.
    375 
    376       Length:
    377 
    378          This 2-octet unsigned integer indicates the total length of the
    379          message, including the header, in octets.  Thus, e.g., it
    380          allows one to locate in the transport-level stream the (Marker
    381          field of the) next message.  The value of the Length field must
    382          always be at least 19 and no greater than 4096, and may be
    383          further constrained, depending on the message type.  No
    384          "padding" of extra data after the message is allowed, so the
    385          Length field must have the smallest value required given the
    386          rest of the message.
    387 
    399       Type:
    400 
    401          This 1-octet unsigned integer indicates the type code of the
    402          message.  The following type codes are defined:
    403 
    404                                     1 - OPEN
    405                                     2 - UPDATE
    406                                     3 - NOTIFICATION
    407                                     4 - KEEPALIVE
    408 
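The fixed-size header described in 4.1 could be decoded roughly as follows.
This is a sketch, not a reference implementation: the names Header and
ParseHeader are hypothetical, multi-octet integers are assumed to be in
network (big-endian) byte order, and the Marker is copied but not verified,
since its expected value depends on the authentication mechanism in use.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	HeaderLen = 19   // 16-octet Marker + 2-octet Length + 1-octet Type
	MaxMsgLen = 4096 // maximum message size in octets
)

// Message type codes defined in 4.1.
const (
	TypeOpen         = 1
	TypeUpdate       = 2
	TypeNotification = 3
	TypeKeepalive    = 4
)

// Header mirrors the fixed-size BGP message header.
type Header struct {
	Marker [16]byte
	Length uint16 // total message length, header included
	Type   uint8
}

// ParseHeader decodes the first 19 octets of b and applies the range checks
// on Length and Type. It does not validate the Marker.
func ParseHeader(b []byte) (Header, error) {
	var h Header
	if len(b) < HeaderLen {
		return h, errors.New("need at least 19 octets for a header")
	}
	copy(h.Marker[:], b[:16])
	h.Length = binary.BigEndian.Uint16(b[16:18])
	h.Type = b[18]
	if h.Length < HeaderLen || h.Length > MaxMsgLen {
		return h, fmt.Errorf("message length %d outside 19..4096", h.Length)
	}
	if h.Type < TypeOpen || h.Type > TypeKeepalive {
		return h, fmt.Errorf("unknown message type %d", h.Type)
	}
	return h, nil
}

func main() {
	// Build a KEEPALIVE: a header with no data portion and an all-ones Marker.
	msg := make([]byte, HeaderLen)
	for i := 0; i < 16; i++ {
		msg[i] = 0xFF
	}
	binary.BigEndian.PutUint16(msg[16:18], HeaderLen)
	msg[18] = TypeKeepalive

	h, err := ParseHeader(msg)
	fmt.Println(h.Length, h.Type, err)
}
```

Because Length counts the header itself, a reader of the transport stream
can read 19 octets, parse them, and then read Length minus 19 further octets
to obtain the rest of the message.
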
    409 4.2 OPEN Message Format
    410 
    411    After a transport protocol connection is established, the first
    412    message sent by each side is an OPEN message.  If the OPEN message is
    413    acceptable, a KEEPALIVE message confirming the OPEN is sent back.
    414    Once the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION
    415    messages may be exchanged.
    416 
    417    In addition to the fixed-size BGP header, the OPEN message contains
    418    the following fields:
    419 
    420        0                   1                   2                   3
    421        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    422        +-+-+-+-+-+-+-+-+
    423        |    Version    |
    424        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    425        |     My Autonomous System      |
    426        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    427        |           Hold Time           |
    428        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    429        |                         BGP Identifier                        |
    430        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    431        | Opt Parm Len  |
    432        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    433        |                                                               |
    434        |                       Optional Parameters                     |
    435        |                                                               |
    436        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    437 
    438       Version:
    439 
    440          This 1-octet unsigned integer indicates the protocol version
    441          number of the message.  The current BGP version number is 4.
    442 
    443       My Autonomous System:
    444 
    445          This 2-octet unsigned integer indicates the Autonomous System
    446          number of the sender.
    447 
    455       Hold Time:
    456 
    457          This 2-octet unsigned integer indicates the number of seconds
    458          that the sender proposes for the value of the Hold Timer.  Upon
    459          receipt of an OPEN message, a BGP speaker MUST calculate the
    460          value of the Hold Timer by using the smaller of its configured
    461          Hold Time and the Hold Time received in the OPEN message.  The
    462          Hold Time MUST be either zero or at least three seconds.  An
    463          implementation may reject connections on the basis of the Hold
    464          Time.  The calculated value indicates the maximum number of
    465          seconds that may elapse between the receipt of successive
    466          KEEPALIVE, and/or UPDATE messages by the sender.
    467 
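The Hold Timer calculation described above amounts to taking a minimum
after a validity check. A minimal sketch, assuming a hypothetical function
name NegotiateHoldTime and assuming that an invalid value is reported as an
error for the caller to act on:

```go
package main

import (
	"errors"
	"fmt"
)

// validHoldTime reports whether v is zero or at least three seconds.
func validHoldTime(v uint16) bool { return v == 0 || v >= 3 }

// NegotiateHoldTime returns the smaller of the locally configured Hold Time
// and the Hold Time received in the peer's OPEN message, both in seconds.
func NegotiateHoldTime(configured, received uint16) (uint16, error) {
	if !validHoldTime(configured) || !validHoldTime(received) {
		return 0, errors.New("hold time must be zero or at least three seconds")
	}
	if received < configured {
		return received, nil
	}
	return configured, nil
}

func main() {
	ht, err := NegotiateHoldTime(90, 30)
	fmt.Println(ht, err)
}
```
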
    468       BGP Identifier:
    469 
    470          This 4-octet unsigned integer indicates the BGP Identifier of
    471          the sender. A given BGP speaker sets the value of its BGP
    472          Identifier to an IP address assigned to that BGP speaker.  The
    473          value of the BGP Identifier is determined on startup and is the
    474          same for every local interface and every BGP peer.
    475 
    476       Optional Parameters Length:
    477 
    478          This 1-octet unsigned integer indicates the total length of the
    479          Optional Parameters field in octets. If the value of this field
    480          is zero, no Optional Parameters are present.
    481 
    482       Optional Parameters:
    483 
    484          This field may contain a list of optional parameters, where
    485          each parameter is encoded as a <Parameter Type, Parameter
    486          Length, Parameter Value> triplet.
    487 
    488           0                   1
    489           0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    490          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
    491          |  Parm. Type   | Parm. Length  |  Parameter Value (variable)
    492          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
    493 
    494          Parameter Type is a one octet field that unambiguously
    495          identifies individual parameters. Parameter Length is a one
    496          octet field that contains the length of the Parameter Value
    497          field in octets.  Parameter Value is a variable length field
    498          that is interpreted according to the value of the Parameter
    499          Type field.
    500 
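The <Parameter Type, Parameter Length, Parameter Value> triplets can be
walked with a simple loop. The sketch below uses hypothetical names
(OptParam, ParseOptParams) and assumes its input is exactly the Optional
Parameters field, that is, Opt Parm Len octets taken from the OPEN message.
The bytes in main are placeholder values, not a meaningful parameter.

```go
package main

import (
	"errors"
	"fmt"
)

// OptParam is one decoded optional parameter.
type OptParam struct {
	Type  uint8
	Value []byte // interpreted according to Type
}

// ParseOptParams splits the Optional Parameters field into triplets.
func ParseOptParams(b []byte) ([]OptParam, error) {
	var params []OptParam
	for len(b) > 0 {
		if len(b) < 2 {
			return nil, errors.New("truncated parameter: missing type or length octet")
		}
		typ, n := b[0], int(b[1])
		if len(b) < 2+n {
			return nil, fmt.Errorf("parameter type %d: value truncated", typ)
		}
		params = append(params, OptParam{Type: typ, Value: b[2 : 2+n]})
		b = b[2+n:]
	}
	return params, nil
}

func main() {
	// One parameter of type 1 carrying three placeholder value octets.
	params, err := ParseOptParams([]byte{1, 3, 0xAA, 0xBB, 0xCC})
	fmt.Println(len(params), err)
}
```
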
    511          This document defines the following Optional Parameters:
    512 
    513          a) Authentication Information (Parameter Type 1):
    514 
    515             This optional parameter may be used to authenticate a BGP
    516             peer. The Parameter Value field contains a 1-octet
    517             Authentication Code followed by a variable length
    518             Authentication Data.
    519 
    520                  0 1 2 3 4 5 6 7
    521                 +-+-+-+-+-+-+-+-+
    522                 |  Auth. Code   |
    523                 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    524                 |                                                     |
    525                 |              Authentication Data                    |
    526                 |                                                     |
    527                 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    528 
    529                Authentication Code:
    530 
    531                   This 1-octet unsigned integer indicates the
    532                   authentication mechanism being used.  Whenever an
    533                   authentication mechanism is specified for use within
    534                   BGP, three things must be included in the
    535                   specification:
    536 
    537                   - the value of the Authentication Code which indicates
    538                   use of the mechanism,
    539                   - the form and meaning of the Authentication Data, and
    540                   - the algorithm for computing values of Marker fields.
    541 
    542                   Note that a separate authentication mechanism may be
    543                   used in establishing the transport level connection.
    544 
    545                Authentication Data:
    546 
    547                   This is a variable-length field whose form and
    548                   meaning depend on the Authentication Code.
    549 
    550          The minimum length of the OPEN message is 29 octets (including
    551          message header).
    552 
    567 4.3 UPDATE Message Format
    568 
    569    UPDATE messages are used to transfer routing information between BGP
    570    peers.  The information in the UPDATE packet can be used to construct
    571    a graph describing the relationships of the various Autonomous
    572    Systems.  By applying rules to be discussed, routing information
    573    loops and some other anomalies may be detected and removed from
    574    inter-AS routing.
    575 
    576    An UPDATE message is used to advertise a single feasible route to a
    577    peer, or to withdraw multiple unfeasible routes from service (see
    578    3.1). An UPDATE message may simultaneously advertise a feasible route
    579    and withdraw multiple unfeasible routes from service.  The UPDATE
    580    message always includes the fixed-size BGP header, and can optionally
    581    include the other fields as shown below:
    582 
    583       +-----------------------------------------------------+
    584       |   Unfeasible Routes Length (2 octets)               |
    585       +-----------------------------------------------------+
    586       |  Withdrawn Routes (variable)                        |
    587       +-----------------------------------------------------+
    588       |   Total Path Attribute Length (2 octets)            |
    589       +-----------------------------------------------------+
    590       |    Path Attributes (variable)                       |
    591       +-----------------------------------------------------+
    592       |   Network Layer Reachability Information (variable) |
    593       +-----------------------------------------------------+
    594 
    595       Unfeasible Routes Length:
    596 
    597          This 2-octet unsigned integer indicates the total length of
    598          the Withdrawn Routes field in octets.  Its value must allow the
    599          length of the Network Layer Reachability Information field to
    600          be determined as specified below.
    601 
    602          A value of 0 indicates that no routes are being withdrawn from
    603          service, and that the WITHDRAWN ROUTES field is not present in
    604          this UPDATE message.
    605 
    606       Withdrawn Routes:
    607 
    608          This is a variable length field that contains a list of IP
    609          address prefixes for the routes that are being withdrawn from
    610          service.  Each IP address prefix is encoded as a 2-tuple of the
    611          form <length, prefix>, whose fields are described below:
    612 
    623                   +---------------------------+
    624                   |   Length (1 octet)        |
    625                   +---------------------------+
    626                   |   Prefix (variable)       |
    627                   +---------------------------+
    628 
    629          The use and the meaning of these fields are as follows:
    630 
    631          a) Length:
    632 
    633             The Length field indicates the length in bits of the IP
    634             address prefix. A length of zero indicates a prefix that
    635             matches all IP addresses (with prefix, itself, of zero
    636             octets).
    637 
    638          b) Prefix:
    639 
    640             The Prefix field contains IP address prefixes followed by
    641             enough trailing bits to make the end of the field fall on an
    642             octet boundary. Note that the value of trailing bits is
    643             irrelevant.
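
                 The <length, prefix> encoding above can be decoded roughly as
                 follows.  This is a minimal Go sketch and not part of the
                 specification: the names Prefix and parsePrefixes are
                 hypothetical, and it assumes 4-octet IP addresses, so a
                 Length above 32 is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// Prefix is one <length, prefix> 2-tuple. Bits is the prefix length in bits;
// Addr holds the prefix left-aligned in 4 octets with trailing bits zeroed.
// (Hypothetical type used only for this sketch.)
type Prefix struct {
	Bits int
	Addr [4]byte
}

// parsePrefixes decodes a run of <length, prefix> tuples such as the
// Withdrawn Routes field.
func parsePrefixes(b []byte) ([]Prefix, error) {
	var out []Prefix
	for len(b) > 0 {
		bits := int(b[0])
		if bits > 32 {
			return nil, errors.New("prefix length exceeds 32 bits")
		}
		n := (bits + 7) / 8 // octets occupied by the prefix; 0 for a zero length
		if len(b) < 1+n {
			return nil, errors.New("truncated prefix")
		}
		p := Prefix{Bits: bits}
		copy(p.Addr[:], b[1:1+n])
		// The value of the trailing bits is irrelevant, so clear them.
		if r := bits % 8; r != 0 {
			mask := byte(0xFF) << uint(8-r)
			p.Addr[n-1] &= mask
		}
		out = append(out, p)
		b = b[1+n:]
	}
	return out, nil
}

func main() {
	// 10.0.0.0/8, 172.16.0.0/12 (with stray trailing bits), and 0.0.0.0/0.
	ps, err := parsePrefixes([]byte{8, 10, 12, 172, 0x1F, 0})
	fmt.Println(ps, err)
}
```

                 Clearing the trailing bits is a design choice of the sketch;
                 it makes two encodings of the same prefix compare equal.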
    644 
    645       Total Path Attribute Length:
    646 
    647          This 2-octet unsigned integer indicates the total length of the
    648          Path Attributes field in octets.  Its value must allow the
    649          length of the Network Layer Reachability field to be determined
    650          as specified below.
    651 
    652          A value of 0 indicates that no Network Layer Reachability
    653          Information field is present in this UPDATE message.
    654 
    655       Path Attributes:
    656 
    657          A variable length sequence of path attributes is present in
    658          every UPDATE.  Each path attribute is a triple <attribute type,
    659          attribute length, attribute value> of variable length.
    660 
    661          Attribute Type is a two-octet field that consists of the
    662          Attribute Flags octet followed by the Attribute Type Code
    663          octet.
    664 
    665                 0                   1
    666                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    667                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    668                |  Attr. Flags  |Attr. Type Code|
    669                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    670 
    679          The high-order bit (bit 0) of the Attribute Flags octet is the
    680          Optional bit.  It defines whether the attribute is optional (if
    681          set to 1) or well-known (if set to 0).
    682 
    683          The second high-order bit (bit 1) of the Attribute Flags octet
    684          is the Transitive bit.  It defines whether an optional
    685          attribute is transitive (if set to 1) or non-transitive (if set
    686          to 0).  For well-known attributes, the Transitive bit must be
    687          set to 1.  (See Section 5 for a discussion of transitive
    688          attributes.)
    689 
    690          The third high-order bit (bit 2) of the Attribute Flags octet
    691          is the Partial bit.  It defines whether the information
    692          contained in the optional transitive attribute is partial (if
    693          set to 1) or complete (if set to 0).  For well-known attributes
    694          and for optional non-transitive attributes the Partial bit must
    695          be set to 0.
    696 
    697          The fourth high-order bit (bit 3) of the Attribute Flags octet
    698          is the Extended Length bit.  It defines whether the Attribute
    699          Length is one octet (if set to 0) or two octets (if set to 1).
    700          Extended Length may be used only if the length of the attribute
    701          value is greater than 255 octets.
    702 
    703          The lower-order four bits of the Attribute Flags octet are
    704          unused. They must be zero (and must be ignored when received).
    705 
    706          The Attribute Type Code octet contains the Attribute Type Code.
    707          Currently defined Attribute Type Codes are discussed in Section
    708          5.
    709 
    710          If the Extended Length bit of the Attribute Flags octet is set
    711          to 0, the third octet of the Path Attribute contains the length
    712          of the attribute data in octets.
    713 
    714          If the Extended Length bit of the Attribute Flags octet is set
    715          to 1, then the third and the fourth octets of the path
    716          attribute contain the length of the attribute data in octets.
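
                 The flag bits and the one- or two-octet length described
                 above might be decoded as in the following Go sketch.  It is
                 illustrative only: the names Attr and parseAttr are
                 hypothetical, and multi-octet fields are assumed to be in
                 network byte order.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Attribute Flags bits. Bit 0 is the high-order bit of the octet.
const (
	flagOptional   = 0x80 // bit 0
	flagTransitive = 0x40 // bit 1
	flagPartial    = 0x20 // bit 2
	flagExtLen     = 0x10 // bit 3
)

// Attr is one decoded path attribute (hypothetical type for this sketch).
type Attr struct {
	Flags byte
	Code  byte
	Value []byte
}

// parseAttr decodes the attribute at the start of b and returns it together
// with the bytes that follow it.
func parseAttr(b []byte) (Attr, []byte, error) {
	if len(b) < 3 {
		return Attr{}, nil, errors.New("truncated attribute header")
	}
	a := Attr{Flags: b[0], Code: b[1]}
	hdr, n := 3, int(b[2]) // one-octet length in the third octet
	if a.Flags&flagExtLen != 0 {
		if len(b) < 4 {
			return Attr{}, nil, errors.New("truncated extended length")
		}
		// Two-octet length in the third and fourth octets.
		hdr, n = 4, int(binary.BigEndian.Uint16(b[2:4]))
	}
	if len(b) < hdr+n {
		return Attr{}, nil, errors.New("attribute value overruns buffer")
	}
	a.Value = b[hdr : hdr+n]
	return a, b[hdr+n:], nil
}

func main() {
	// Flags 0x40 (well-known, transitive), type code 1, length 1, value 0.
	a, rest, err := parseAttr([]byte{0x40, 1, 1, 0})
	fmt.Println(a, len(rest), err)
}
```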
    717 
    718          The remaining octets of the Path Attribute represent the
    719          attribute value and are interpreted according to the Attribute
    720          Flags and the Attribute Type Code. The supported Attribute Type
    721          Codes, their attribute values and uses are the following:
    722 
    735          a) ORIGIN (Type Code 1):
    736 
    737             ORIGIN is a well-known mandatory attribute that defines the
    738             origin of the path information.   The data octet can assume
    739             the following values:
    740 
    741                   Value      Meaning
    742 
    743                   0         IGP - Network Layer Reachability Information
    744                                is interior to the originating AS
    745 
    746                   1         EGP - Network Layer Reachability Information
    747                                learned via EGP
    748 
    749                   2         INCOMPLETE - Network Layer Reachability
    750                                Information learned by some other means
    751 
    752             Its usage is defined in 5.1.1.
    753 
    754          b) AS_PATH (Type Code 2):
    755 
    756             AS_PATH is a well-known mandatory attribute that is composed
    757             of a sequence of AS path segments. Each AS path segment is
    758             represented by a triple <path segment type, path segment
    759             length, path segment value>.
    760 
    791             The path segment type is a 1-octet long field with the
    792             following values defined:
    793 
    794                   Value      Segment Type
    795 
    796                   1         AS_SET: unordered set of ASs a route in the
    797                                UPDATE message has traversed
    798 
    799                   2         AS_SEQUENCE: ordered set of ASs a route in
    800                                the UPDATE message has traversed
    801 
    802             The path segment length is a 1-octet long field containing
    803             the number of ASs in the path segment value field.
    804 
    805             The path segment value field contains one or more AS
    806             numbers, each encoded as a 2-octet long field.
    807 
    808             Usage of this attribute is defined in 5.1.2.
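
                    As an illustration, the segment layout described above
                    could be decoded with the Go sketch below.  The names
                    Segment and parseASPath are hypothetical, and AS numbers
                    are assumed to be in network byte order.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Path segment type values.
const (
	asSet      = 1
	asSequence = 2
)

// Segment is one <type, length, value> triple of an AS_PATH
// (hypothetical type for this sketch).
type Segment struct {
	Type byte
	ASNs []uint16
}

// parseASPath decodes the value of an AS_PATH attribute.
func parseASPath(b []byte) ([]Segment, error) {
	var segs []Segment
	for len(b) > 0 {
		if len(b) < 2 {
			return nil, errors.New("truncated segment header")
		}
		typ, count := b[0], int(b[1]) // count is in ASs, not octets
		if typ != asSet && typ != asSequence {
			return nil, errors.New("unknown path segment type")
		}
		if len(b) < 2+2*count {
			return nil, errors.New("truncated segment value")
		}
		seg := Segment{Type: typ, ASNs: make([]uint16, count)}
		for i := range seg.ASNs {
			seg.ASNs[i] = binary.BigEndian.Uint16(b[2+2*i:])
		}
		segs = append(segs, seg)
		b = b[2+2*count:]
	}
	return segs, nil
}

func main() {
	// One AS_SEQUENCE holding AS 65000 followed by AS 65001.
	segs, err := parseASPath([]byte{2, 2, 0xFD, 0xE8, 0xFD, 0xE9})
	fmt.Println(segs, err)
}
```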
    809 
    810          c) NEXT_HOP (Type Code 3):
    811 
    812             This is a well-known mandatory attribute that defines the IP
    813             address of the border router that should be used as the next
    814             hop to the destinations listed in the Network Layer
    815             Reachability field of the UPDATE message.
    816 
    817             Usage of this attribute is defined in 5.1.3.
    818 
    819          d) MULTI_EXIT_DISC (Type Code 4):
    820 
    821             This is an optional non-transitive attribute that is a four
    822             octet non-negative integer. The value of this attribute may
    823             be used by a BGP speaker's decision process to discriminate
    824             among multiple exit points to a neighboring autonomous
    825             system.
    826 
    827             Its usage is defined in 5.1.4.
    828 
    829          e) LOCAL_PREF (Type Code 5):
    830 
    831             LOCAL_PREF is a well-known discretionary attribute that is a
    832             four octet non-negative integer. It is used by a BGP speaker
    833             to inform other BGP speakers in its own autonomous system of
    834             the originating speaker's degree of preference for an
    835             advertised route. Usage of this attribute is described in
    836             5.1.5.
    837 
    847          f) ATOMIC_AGGREGATE (Type Code 6):
    848 
    849             ATOMIC_AGGREGATE is a well-known discretionary attribute of
    850             length 0. It is used by a BGP speaker to inform other BGP
    851             speakers that the local system selected a less specific
    852             route without selecting a more specific route which is
    853             included in it. Usage of this attribute is described in
    854             5.1.6.
    855 
    856          g) AGGREGATOR (Type Code 7):
    857 
    858             AGGREGATOR is an optional transitive attribute of length 6.
    859             The attribute contains the last AS number that formed the
    860             aggregate route (encoded as 2 octets), followed by the IP
    861             address of the BGP speaker that formed the aggregate route
    862             (encoded as 4 octets).  Usage of this attribute is described
    863             in 5.1.7.
    864 
    865       Network Layer Reachability Information:
    866 
    867          This variable length field contains a list of IP address
    868          prefixes.  The length in octets of the Network Layer
    869          Reachability Information is not encoded explicitly, but can be
    870          calculated as:
    871 
    872             UPDATE message Length - 23 - Total Path Attribute Length -
    873             Unfeasible Routes Length
    874 
    875          where UPDATE message Length is the value encoded in the fixed-
    876          size BGP header, Total Path Attribute Length and Unfeasible
    877          Routes Length are the values encoded in the variable part of
    878          the UPDATE message, and 23 is the combined length of the fixed-
    879          size BGP header, the Total Path Attribute Length field and the
    880          Unfeasible Routes Length field.
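
                 The calculation above can be sketched in Go as follows.  The
                 function name splitUpdate is hypothetical.  The sketch
                 assumes the caller passes exactly one whole message,
                 including the 19-octet header whose Length field occupies
                 octets 16 and 17, in network byte order.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerLen    = 19            // fixed-size BGP header
	minUpdateLen = headerLen + 4 // header plus the two 2-octet length fields
)

// splitUpdate returns the Withdrawn Routes, Path Attributes and NLRI fields
// of an UPDATE message.
func splitUpdate(msg []byte) (withdrawn, attrs, nlri []byte, err error) {
	if len(msg) < minUpdateLen {
		return nil, nil, nil, errors.New("update shorter than 23 octets")
	}
	total := int(binary.BigEndian.Uint16(msg[16:18]))
	if total != len(msg) {
		return nil, nil, nil, errors.New("header length does not match buffer")
	}
	body := msg[headerLen:]
	wlen := int(binary.BigEndian.Uint16(body[0:2])) // Unfeasible Routes Length
	if 2+wlen+2 > len(body) {
		return nil, nil, nil, errors.New("unfeasible routes length overruns message")
	}
	alen := int(binary.BigEndian.Uint16(body[2+wlen : 4+wlen])) // Total Path Attribute Length
	// NLRI length = message Length - 23 - attribute length - withdrawn length.
	nlriLen := total - minUpdateLen - alen - wlen
	if nlriLen < 0 {
		return nil, nil, nil, errors.New("path attribute length overruns message")
	}
	withdrawn = body[2 : 2+wlen]
	attrs = body[4+wlen : 4+wlen+alen]
	nlri = body[4+wlen+alen:]
	return withdrawn, attrs, nlri, nil
}

func main() {
	msg := make([]byte, headerLen, 29)
	for i := 0; i < 16; i++ {
		msg[i] = 0xFF // Marker
	}
	binary.BigEndian.PutUint16(msg[16:18], 29)
	msg[18] = 2 // message type UPDATE
	// No withdrawn routes, 4 octets of attributes, then NLRI 10.0.0.0/8.
	msg = append(msg, 0, 0, 0, 4, 0x40, 1, 1, 0, 8, 10)
	w, a, n, err := splitUpdate(msg)
	fmt.Println(len(w), len(a), len(n), err)
}
```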
    881 
    882          Reachability information is encoded as one or more 2-tuples of
    883          the form <length, prefix>, whose fields are described below:
    884 
    885                   +---------------------------+
    886                   |   Length (1 octet)        |
    887                   +---------------------------+
    888                   |   Prefix (variable)       |
    889                   +---------------------------+
    890 
    903          The use and the meaning of these fields are as follows:
    904 
    905          a) Length:
    906 
    907             The Length field indicates the length in bits of the IP
    908             address prefix. A length of zero indicates a prefix that
    909             matches all IP addresses (with prefix, itself, of zero
    910             octets).
    911 
    912          b) Prefix:
    913 
    914             The Prefix field contains IP address prefixes followed by
    915             enough trailing bits to make the end of the field fall on an
    916             octet boundary. Note that the value of the trailing bits is
    917             irrelevant.
    918 
    919    The minimum length of the UPDATE message is 23 octets -- 19 octets
    920    for the fixed header + 2 octets for the Unfeasible Routes Length + 2
    921    octets for the Total Path Attribute Length (the value of Unfeasible
    922    Routes Length is 0 and the value of Total Path Attribute Length is
    923    0).
    924 
    925    An UPDATE message can advertise at most one route, which may be
    926    described by several path attributes. All path attributes contained
    927    in a given UPDATE message apply to the destinations carried in the
    928    Network Layer Reachability Information field of the UPDATE message.
    929 
    930    An UPDATE message can list multiple routes to be withdrawn from
    931    service.  Each such route is identified by its destination (expressed
    932    as an IP prefix), which unambiguously identifies the route in the
    933    context of the BGP speaker - BGP speaker connection to which it has
    934    previously been advertised.
    935 
    936    An UPDATE message may advertise only routes to be withdrawn from
    937    service, in which case it will not include path attributes or Network
    938    Layer Reachability Information. Conversely, it may advertise only a
    939    feasible route, in which case the WITHDRAWN ROUTES field need not be
    940    present.
    941 
    942 4.4 KEEPALIVE Message Format
    943 
    944    BGP does not use any transport protocol-based keep-alive mechanism to
    945    determine if peers are reachable.  Instead, KEEPALIVE messages are
    946    exchanged between peers often enough as not to cause the Hold Timer
    947    to expire.  A reasonable maximum time between KEEPALIVE messages
    948    would be one third of the Hold Time interval.  KEEPALIVE messages
    949    MUST NOT be sent more frequently than one per second.  An
    950    implementation MAY adjust the rate at which it sends KEEPALIVE
    959    messages as a function of the Hold Time interval.
    960 
    961    If the negotiated Hold Time interval is zero, then periodic KEEPALIVE
    962    messages MUST NOT be sent.
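
           One way to derive a KEEPALIVE period from the negotiated Hold Time
           is sketched below in Go.  The function name keepaliveInterval is
           hypothetical, and the choice of exactly one third of the Hold Time
           is only the "reasonable maximum" suggested above, not a
           requirement.

```go
package main

import (
	"fmt"
	"time"
)

// keepaliveInterval returns the period between KEEPALIVE messages for a
// negotiated Hold Time given in seconds. The boolean is false when no
// periodic KEEPALIVE messages are to be sent.
func keepaliveInterval(holdSeconds uint16) (time.Duration, bool) {
	if holdSeconds == 0 {
		return 0, false // zero Hold Time: KEEPALIVEs MUST NOT be sent
	}
	iv := time.Duration(holdSeconds) * time.Second / 3
	if iv < time.Second {
		iv = time.Second // never more frequently than one per second
	}
	return iv, true
}

func main() {
	for _, h := range []uint16{0, 3, 90} {
		iv, ok := keepaliveInterval(h)
		fmt.Println(h, iv, ok)
	}
}
```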
    963 
    964    A KEEPALIVE message consists of only the message header and has a
    965    length of 19 octets.
    966 
    967 4.5 NOTIFICATION Message Format
    968 
    969    A NOTIFICATION message is sent when an error condition is detected.
    970    The BGP connection is closed immediately after sending it.
    971 
    972    In addition to the fixed-size BGP header, the NOTIFICATION message
    973    contains the following fields:
    974 
    975         0                   1                   2                   3
    976         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    977        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    978        | Error code    | Error subcode |           Data                |
    979        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
    980        |                                                               |
    981        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    982 
    983       Error Code:
    984 
    985          This 1-octet unsigned integer indicates the type of
    986          NOTIFICATION.  The following Error Codes have been defined:
    987 
    988             Error Code       Symbolic Name               Reference
    989 
    990               1         Message Header Error             Section 6.1
    991 
    992               2         OPEN Message Error               Section 6.2
    993 
    994               3         UPDATE Message Error             Section 6.3
    995 
    996               4         Hold Timer Expired               Section 6.5
    997 
    998               5         Finite State Machine Error       Section 6.6
    999 
   1000               6         Cease                            Section 6.7
   1001 
   1002       Error subcode:
   1003 
   1004          This 1-octet unsigned integer provides more specific
   1005          information about the nature of the reported error.  Each Error
   1006          Code may have one or more Error Subcodes associated with it.
   1015          If no appropriate Error Subcode is defined, then a zero
   1016          (Unspecific) value is used for the Error Subcode field.
   1017 
   1018          Message Header Error subcodes:
   1019 
   1020                                1  - Connection Not Synchronized.
   1021                                2  - Bad Message Length.
   1022                                3  - Bad Message Type.
   1023 
   1024          OPEN Message Error subcodes:
   1025 
   1026                                1  - Unsupported Version Number.
   1027                                2  - Bad Peer AS.
   1028                                3  - Bad BGP Identifier.
   1029                                4  - Unsupported Optional Parameter.
   1030                                5  - Authentication Failure.
   1031                                6  - Unacceptable Hold Time.
   1032 
   1033          UPDATE Message Error subcodes:
   1034 
   1035                                1 - Malformed Attribute List.
   1036                                2 - Unrecognized Well-known Attribute.
   1037                                3 - Missing Well-known Attribute.
   1038                                4 - Attribute Flags Error.
   1039                                5 - Attribute Length Error.
   1040                                6 - Invalid ORIGIN Attribute.
   1041                                7 - AS Routing Loop.
   1042                                8 - Invalid NEXT_HOP Attribute.
   1043                                9 - Optional Attribute Error.
   1044                               10 - Invalid Network Field.
   1045                               11 - Malformed AS_PATH.
   1046 
   1047       Data:
   1048 
   1049          This variable-length field is used to diagnose the reason for
   1050          the NOTIFICATION.  The contents of the Data field depend upon
   1051          the Error Code and Error Subcode.  See Section 6 below for more
   1052          details.
   1053 
   1054          Note that the length of the Data field can be determined from
   1055          the message Length field by the formula:
   1056 
   1057                   Message Length = 21 + Data Length
   1058 
   1059    The minimum length of the NOTIFICATION message is 21 octets
   1060    (including message header).
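
           The NOTIFICATION layout and the length formula above can be
           sketched in Go as follows.  The function names are hypothetical.
           The sketch assumes an all-ones Marker (no authentication), message
           type code 3 for NOTIFICATION, and network byte order; it does not
           enforce the maximum message size.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerLen       = 19 // fixed-size BGP header
	minNotification = headerLen + 2
	msgNotification = 3 // message type code in the header
)

// encodeNotification builds a complete NOTIFICATION message.
func encodeNotification(code, subcode byte, data []byte) []byte {
	msg := make([]byte, minNotification, minNotification+len(data))
	for i := 0; i < 16; i++ {
		msg[i] = 0xFF // Marker, assumed all ones in this sketch
	}
	// Message Length = 21 + Data Length
	binary.BigEndian.PutUint16(msg[16:18], uint16(minNotification+len(data)))
	msg[18] = msgNotification
	msg[19], msg[20] = code, subcode
	return append(msg, data...)
}

// decodeNotification extracts Error Code, Error Subcode and Data.
func decodeNotification(msg []byte) (code, subcode byte, data []byte, err error) {
	if len(msg) < minNotification {
		return 0, 0, nil, errors.New("notification shorter than 21 octets")
	}
	total := int(binary.BigEndian.Uint16(msg[16:18]))
	if total < minNotification || total > len(msg) {
		return 0, 0, nil, errors.New("bad message length")
	}
	// Data Length = Message Length - 21
	return msg[19], msg[20], msg[minNotification:total], nil
}

func main() {
	// OPEN Message Error (2), Unsupported Version Number (1), two data octets.
	m := encodeNotification(2, 1, []byte{0, 4})
	fmt.Println(decodeNotification(m))
}
```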
   1061 
   1071 5.  Path Attributes
   1072 
   1073    This section discusses the path attributes of the UPDATE message.
   1074 
   1075    Path attributes fall into four separate categories:
   1076 
   1077                1. Well-known mandatory.
   1078                2. Well-known discretionary.
   1079                3. Optional transitive.
   1080                4. Optional non-transitive.
   1081 
   1082    Well-known attributes must be recognized by all BGP implementations.
   1083    Some of these attributes are mandatory and must be included in every
   1084    UPDATE message.  Others are discretionary and may or may not be sent
   1085    in a particular UPDATE message.
   1086 
   1087    All well-known attributes must be passed along (after proper
   1088    updating, if necessary) to other BGP peers.
   1089 
   1090    In addition to well-known attributes, each path may contain one or
   1091    more optional attributes.  It is not required or expected that all
   1092    BGP implementations support all optional attributes.  The handling of
   1093    an unrecognized optional attribute is determined by the setting of
   1094    the Transitive bit in the attribute flags octet.  Paths with
   1095    unrecognized transitive optional attributes should be accepted. If a
   1096    path with unrecognized transitive optional attribute is accepted and
   1097    passed along to other BGP peers, then the unrecognized transitive
   1098    optional attribute of that path must be passed along with the path to
   1099    other BGP peers with the Partial bit in the Attribute Flags octet set
   1100    to 1. If a path with recognized transitive optional attribute is
   1101    accepted and passed along to other BGP peers and the Partial bit in
   1102    the Attribute Flags octet is set to 1 by some previous AS, it is not
   1103    set back to 0 by the current AS. Unrecognized non-transitive optional
   1104    attributes must be quietly ignored and not passed along to other BGP
   1105    peers.
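
           The treatment of an attribute that the local speaker does not
           recognize can be summarized by the Go sketch below.  The name
           forwardUnrecognized is hypothetical.  Reporting an unrecognized
           well-known attribute as an error follows the "Unrecognized
           Well-known Attribute" subcode listed in Section 4.5.

```go
package main

import (
	"errors"
	"fmt"
)

// Attribute Flags bits. Bit 0 is the high-order bit of the octet.
const (
	flagOptional   = 0x80
	flagTransitive = 0x40
	flagPartial    = 0x20
)

// forwardUnrecognized decides what to do with an attribute whose type code
// the local speaker does not recognize. It returns the flags to use when
// passing the attribute along and whether to pass it along at all.
func forwardUnrecognized(flags byte) (byte, bool, error) {
	switch {
	case flags&flagOptional == 0:
		// Well-known attributes must be recognized by all implementations.
		return 0, false, errors.New("unrecognized well-known attribute")
	case flags&flagTransitive != 0:
		// Optional transitive: pass along with the Partial bit set to 1.
		return flags | flagPartial, true, nil
	default:
		// Optional non-transitive: quietly ignore, do not pass along.
		return 0, false, nil
	}
}

func main() {
	for _, f := range []byte{0xC0, 0x80, 0x40} {
		out, fwd, err := forwardUnrecognized(f)
		fmt.Printf("%#02x -> %#02x forward=%v err=%v\n", f, out, fwd, err)
	}
}
```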
   1106 
   1107    New transitive optional attributes may be attached to the path by the
   1108    originator or by any other AS in the path.  If they are not attached
   1109    by the originator, the Partial bit in the Attribute Flags octet is
   1110    set to 1.  The rules for attaching new non-transitive optional
   1111    attributes will depend on the nature of the specific attribute.  The
   1112    documentation of each new non-transitive optional attribute will be
   1113    expected to include such rules.  (The description of the
   1114    MULTI_EXIT_DISC attribute gives an example.)  All optional attributes
   1115    (both transitive and non-transitive) may be updated (if appropriate)
   1116    by ASs in the path.
   1117 
   1127    The sender of an UPDATE message should order path attributes within
   1128    the UPDATE message in ascending order of attribute type.  The
   1129    receiver of an UPDATE message must be prepared to handle path
   1130    attributes within the UPDATE message that are out of order.
   1131 
   1132    The same attribute cannot appear more than once within the Path
   1133    Attributes field of a particular UPDATE message.
   1134 
   1135 5.1 Path Attribute Usage
   1136 
   1137    The usage of each BGP path attribute is described in the following
   1138    clauses.
   1139 
   1140 5.1.1 ORIGIN
   1141 
   1142    ORIGIN is a well-known mandatory attribute.  The ORIGIN attribute
   1143    shall be generated by the autonomous system that originates the
   1144    associated routing information. It shall be included in the UPDATE
   1145    messages of all BGP speakers that choose to propagate this
   1146    information to other BGP speakers.
   1147 
   1148 5.1.2   AS_PATH
   1149 
   1150    AS_PATH is a well-known mandatory attribute. This attribute
   1151    identifies the autonomous systems through which routing information
   1152    carried in this UPDATE message has passed. The components of this
   1153    list can be AS_SETs or AS_SEQUENCEs.
   1154 
   1155    When a BGP speaker propagates a route which it has learned from
   1156    another BGP speaker's UPDATE message, it shall modify the route's
   1157    AS_PATH attribute based on the location of the BGP speaker to which
   1158    the route will be sent:
   1159 
   1160       a) When a given BGP speaker advertises the route to another BGP
   1161       speaker located in its own autonomous system, the advertising
   1162       speaker shall not modify the AS_PATH attribute associated with the
   1163       route.
   1164 
   1165       b) When a given BGP speaker advertises the route to a BGP speaker
   1166       located in a neighboring autonomous system, then the advertising
   1167       speaker shall update the AS_PATH attribute as follows:
   1168 
   1169          1) if the first path segment of the AS_PATH is of type
   1170          AS_SEQUENCE, the local system shall prepend its own AS number
   1171          as the last element of the sequence (put it in the leftmost
   1172          position).
   1173 
   1183          2) if the first path segment of the AS_PATH is of type AS_SET,
   1184          the local system shall prepend a new path segment of type
   1185          AS_SEQUENCE to the AS_PATH, including its own AS number in that
   1186          segment.
   1187 
   1188       When a BGP speaker originates a route then:
   1189 
   1190          a) the originating speaker shall include its own AS number in
   1191          the AS_PATH attribute of all UPDATE messages sent to BGP
   1192          speakers located in neighboring autonomous systems. (In this
   1193          case, the AS number of the originating speaker's autonomous
   1194          system will be the only entry in the AS_PATH attribute).
   1195 
   1196          b) the originating speaker shall include an empty AS_PATH
   1197          attribute in all UPDATE messages sent to BGP speakers located
   1198          in its own autonomous system. (An empty AS_PATH attribute is
   1199          one whose length field contains the value zero).
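
           The AS_PATH rules of this section might be expressed as in the Go
           sketch below.  The names Segment and advertiseASPath are
           hypothetical.  Two points are assumptions of the sketch rather
           than statements of this document: a locally originated route sent
           to a neighboring AS is given a single AS_SEQUENCE segment, and a
           first segment that already holds 255 ASs is not handled.

```go
package main

import "fmt"

// Path segment type values.
const (
	asSet      = 1
	asSequence = 2
)

// Segment is one AS path segment (hypothetical type for this sketch).
type Segment struct {
	Type byte
	ASNs []uint16
}

// advertiseASPath returns the AS_PATH to place in an UPDATE sent to a peer.
// An empty path stands for a route originated by the local speaker.
func advertiseASPath(path []Segment, localAS uint16, peerInternal bool) []Segment {
	if peerInternal {
		// Same autonomous system: the AS_PATH is not modified
		// (and stays empty for a locally originated route).
		return path
	}
	if len(path) > 0 && path[0].Type == asSequence {
		// Put the local AS number in the leftmost position of the sequence.
		first := Segment{
			Type: asSequence,
			ASNs: append([]uint16{localAS}, path[0].ASNs...),
		}
		return append([]Segment{first}, path[1:]...)
	}
	// First segment is an AS_SET, or the path is empty: prepend a new
	// AS_SEQUENCE segment that contains only the local AS number.
	fresh := Segment{Type: asSequence, ASNs: []uint16{localAS}}
	return append([]Segment{fresh}, path...)
}

func main() {
	in := []Segment{{Type: asSequence, ASNs: []uint16{65001, 65002}}}
	fmt.Println(advertiseASPath(in, 65000, false))
	fmt.Println(advertiseASPath(in, 65000, true))
	fmt.Println(advertiseASPath(nil, 65000, false))
}
```

           The sketch copies rather than mutates the input path, so the
           same stored path can be advertised to several peers.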
   1200 
   1201 5.1.3 NEXT_HOP
   1202 
   1203    The NEXT_HOP path attribute defines the IP address of the border
   1204    router that should be used as the next hop to the destinations listed
   1205    in the UPDATE message.  If a border router belongs to the same AS as
   1206    its peer, then the peer is an internal border router. Otherwise, it
   1207    is an external border router.  A BGP speaker can advertise any
   1208    internal border router as the next hop provided that the interface
   1209    associated with the IP address of this border router (as specified in
   1210    the NEXT_HOP path attribute) shares a common subnet with both the
   1211    local and remote BGP speakers. A BGP speaker can advertise any
   1212    external border router as the next hop, provided that the IP address
   1213    of this border router was learned from one of the BGP speaker's
   1214    peers, and the interface associated with the IP address of this
   1215    border router (as specified in the NEXT_HOP path attribute) shares a
   1216    common subnet with the local and remote BGP speakers.  A BGP speaker
   1217    needs to be able to support disabling advertisement of external
   1218    border routers.
   1219 
   1220    A BGP speaker must never advertise an address of a peer to that peer
   1221    as a NEXT_HOP, for a route that the speaker is originating.  A BGP
   1222    speaker must never install a route with itself as the next hop.
   1223 
   1224    When a BGP speaker advertises the route to a BGP speaker located in
   1225    its own autonomous system, the advertising speaker shall not modify
   1226    the NEXT_HOP attribute associated with the route.  When a BGP speaker
   1227    receives the route via an internal link, it may forward packets to
   1228    the NEXT_HOP address if the address contained in the attribute is on
   1229    a common subnet with the local and remote BGP speakers.
   1230 
   1239 5.1.4   MULTI_EXIT_DISC
   1240 
   1241    The MULTI_EXIT_DISC attribute may be used on external (inter-AS)
   1242    links to discriminate among multiple exit or entry points to the same
   1243    neighboring AS.  The value of the MULTI_EXIT_DISC attribute is a four
   1244    octet unsigned number which is called a metric.  All other factors
   1245    being equal, the exit or entry point with lower metric should be
   1246    preferred.  If received over external links, the MULTI_EXIT_DISC
   1247    attribute may be propagated over internal links to other BGP speakers
   1248    within the same AS.  The MULTI_EXIT_DISC attribute is never
   1249    propagated to other BGP speakers in neighboring AS's.
   1250 
   1251 5.1.5   LOCAL_PREF
   1252 
   1253    LOCAL_PREF is a well-known discretionary attribute that shall be
   1254    included in all UPDATE messages that a given BGP speaker sends to the
   1255    other BGP speakers located in its own autonomous system. A BGP
   1256    speaker shall calculate the degree of preference for each external
   1257    route and include the degree of preference when advertising a route
   1258    to its internal peers. The higher degree of preference should be
   1259    preferred. A BGP speaker shall use the degree of preference learned
   1260    via LOCAL_PREF in its decision process (see section 9.1.1).
   1261 
   1262    A BGP speaker shall not include this attribute in UPDATE messages
   1263    that it sends to BGP speakers located in a neighboring autonomous
   1264    system. If it is contained in an UPDATE message that is received from
   1265    a BGP speaker which is not located in the same autonomous system as
   1266    the receiving speaker, then this attribute shall be ignored by the
   1267    receiving speaker.
   1268 
   1269 5.1.6   ATOMIC_AGGREGATE
   1270 
   1271    ATOMIC_AGGREGATE is a well-known discretionary attribute.  If a BGP
   1272    speaker, when presented with a set of overlapping routes from one of
   1273    its peers (see 9.1.4), selects the less specific route without
   1274    selecting the more specific one, then the local system shall attach
   1275    the ATOMIC_AGGREGATE attribute to the route when propagating it to
   1276    other BGP speakers (if that attribute is not already present in the
   1277    received less specific route). A BGP speaker that receives a route
   1278    with the ATOMIC_AGGREGATE attribute shall not remove the attribute
   1279    from the route when propagating it to other speakers. A BGP speaker
   1280    that receives a route with the ATOMIC_AGGREGATE attribute shall not
   1281    make any NLRI of that route more specific (as defined in 9.1.4) when
   1282    advertising this route to other BGP speakers.  A BGP speaker that
   1283    receives a route with the ATOMIC_AGGREGATE attribute needs to be
   1284    cognizant of the fact that the actual path to destinations, as
   1285    specified in the NLRI of the route, while having the loop-free
   1286    property, may traverse ASs that are not listed in the AS_PATH
   1295    attribute.
   1296 
   1297 5.1.7   AGGREGATOR
   1298 
   1299    AGGREGATOR is an optional transitive attribute which may be included
   1300    in updates which are formed by aggregation (see Section 9.2.4.2).  A
   1301    BGP speaker which performs route aggregation may add the AGGREGATOR
   1302    attribute which shall contain its own AS number and IP address.
   1303 
   1304 6.  BGP Error Handling.
   1305 
   1306    This section describes actions to be taken when errors are detected
   1307    while processing BGP messages.
   1308 
   1309    When any of the conditions described here are detected, a
   1310    NOTIFICATION message with the indicated Error Code, Error Subcode,
   1311    and Data fields is sent, and the BGP connection is closed.  If no
   1312    Error Subcode is specified, then a zero must be used.
   1313 
   1314    The phrase "the BGP connection is closed" means that the transport
   1315    protocol connection has been closed and that all resources for that
   1316    BGP connection have been deallocated.  Routing table entries
   1317    associated with the remote peer are marked as invalid.  The fact that
   1318    the routes have become invalid is passed to other BGP peers before
   1319    the routes are deleted from the system.
   1320 
   1321    Unless specified explicitly, the Data field of the NOTIFICATION
   1322    message that is sent to indicate an error is empty.
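
As a minimal sketch of the rule above, the following Go fragment
serializes a NOTIFICATION message with a zero Error Subcode and an empty
Data field by default. The helper name EncodeNotification is
hypothetical. The numeric Error Codes and the message layout are taken
from the NOTIFICATION message definition (Section 4.5), and the all-ones
Marker assumes that no authentication is in use.

```go
package main

import "encoding/binary"

// Error Codes carried in a NOTIFICATION message (values as assigned in
// Section 4.5).
const (
	ErrMessageHeader uint8 = 1
	ErrOpenMessage   uint8 = 2
	ErrUpdateMessage uint8 = 3
	ErrHoldTimer     uint8 = 4
	ErrFSM           uint8 = 5
	ErrCease         uint8 = 6
)

const (
	headerLen        = 19 // 16-octet Marker, 2-octet Length, 1-octet Type
	typeNotification = 3
)

// EncodeNotification is a hypothetical helper that serializes a
// NOTIFICATION message. Pass subcode 0 when no Error Subcode is
// specified and nil data when the Data field is empty. The Marker is
// written as all ones (assumption: no authentication), and the
// 4096-octet message limit is not enforced here.
func EncodeNotification(code, subcode uint8, data []byte) []byte {
	msg := make([]byte, headerLen+2+len(data))
	for i := 0; i < 16; i++ {
		msg[i] = 0xff
	}
	binary.BigEndian.PutUint16(msg[16:18], uint16(len(msg)))
	msg[18] = typeNotification
	msg[19] = code
	msg[20] = subcode
	copy(msg[21:], data)
	return msg
}
```

A caller would write the returned octets to the transport connection and
then close it, since every error in this section ends the BGP
connection.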
   1323 
   1324 6.1 Message Header error handling.
   1325 
   1326    All errors detected while processing the Message Header are indicated
   1327    by sending the NOTIFICATION message with Error Code Message Header
   1328    Error.  The Error Subcode elaborates on the specific nature of the
   1329    error.
   1330 
   1331    The expected value of the Marker field of the message header is all
   1332    ones if the message type is OPEN.  The expected value of the Marker
   1333    field for all other types of BGP messages is determined based on the
   1334    presence of the Authentication Information Optional Parameter in the
   1335    BGP OPEN message and the actual authentication mechanism (if the
   1336    Authentication Information in the BGP OPEN message is present). If
   1337    the Marker field of the message header is not the expected one, then
   1338    a synchronization error has occurred and the Error Subcode is set to
   1339    Connection Not Synchronized.
   1340 
   1351    If the Length field of the message header is less than 19 or greater
   1352    than 4096, or if the Length field of an OPEN message is less than
   1353    the minimum length of the OPEN message, or if the Length field of an
   1354    UPDATE message is less than the minimum length of the UPDATE message,
   1355    or if the Length field of a KEEPALIVE message is not equal to 19, or
   1356    if the Length field of a NOTIFICATION message is less than the
   1357    minimum length of the NOTIFICATION message, then the Error Subcode is
   1358    set to Bad Message Length.  The Data field contains the erroneous
   1359    Length field.
   1360 
   1361    If the Type field of the message header is not recognized, then the
   1362    Error Subcode is set to Bad Message Type.  The Data field contains
   1363    the erroneous Type field.
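
The header checks of this subsection can be sketched in Go as shown
below. This is an illustration under stated assumptions, not a reference
implementation: the names CheckHeader and HeaderError are hypothetical,
the Marker is expected to be all ones for every message type (that is,
no authentication was negotiated), and the minimum lengths and subcode
values are those given in the message format definitions (Sections 4.1
through 4.5).

```go
package main

import "encoding/binary"

const (
	HeaderLen     = 19
	MaxMessageLen = 4096
)

// Message types.
const (
	TypeOpen         uint8 = 1
	TypeUpdate       uint8 = 2
	TypeNotification uint8 = 3
	TypeKeepalive    uint8 = 4
)

// Message Header Error subcodes.
const (
	SubConnectionNotSynchronized uint8 = 1
	SubBadMessageLength          uint8 = 2
	SubBadMessageType            uint8 = 3
)

// minLen holds the minimum total message length for each known type.
var minLen = map[uint8]uint16{
	TypeOpen: 29, TypeUpdate: 23, TypeNotification: 21, TypeKeepalive: 19,
}

// HeaderError carries the Error Subcode and Data field to report.
type HeaderError struct {
	Subcode uint8
	Data    []byte
}

// CheckHeader validates a 19-octet message header and returns nil when
// no error is detected.
func CheckHeader(h [HeaderLen]byte) *HeaderError {
	for _, b := range h[:16] {
		if b != 0xff { // assumption: no authentication in use
			return &HeaderError{Subcode: SubConnectionNotSynchronized}
		}
	}
	length := binary.BigEndian.Uint16(h[16:18])
	badLen := &HeaderError{Subcode: SubBadMessageLength, Data: []byte{h[16], h[17]}}
	if length < HeaderLen || length > MaxMessageLen {
		return badLen
	}
	minimum, known := minLen[h[18]]
	if !known {
		return &HeaderError{Subcode: SubBadMessageType, Data: []byte{h[18]}}
	}
	if length < minimum || (h[18] == TypeKeepalive && length != HeaderLen) {
		return badLen
	}
	return nil
}
```

The Data field is filled with the erroneous Length or Type field, as
required above. The order of the checks is a design choice of this
sketch.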
   1364 
   1365 6.2 OPEN message error handling.
   1366 
   1367    All errors detected while processing the OPEN message are indicated
   1368    by sending the NOTIFICATION message with Error Code OPEN Message
   1369    Error.  The Error Subcode elaborates on the specific nature of the
   1370    error.
   1371 
   1372    If the version number contained in the Version field of the received
   1373    OPEN message is not supported, then the Error Subcode is set to
   1374    Unsupported Version Number.  The Data field is a 2-octet unsigned
   1375    integer, which indicates the largest locally supported version number
   1376    less than the version the remote BGP peer bid (as indicated in the
   1377    received OPEN message).
   1378 
   1379    If the Autonomous System field of the OPEN message is unacceptable,
   1380    then the Error Subcode is set to Bad Peer AS.  The determination of
   1381    acceptable Autonomous System numbers is outside the scope of this
   1382    protocol.
   1383 
   1384    If the Hold Time field of the OPEN message is unacceptable, then the
   1385    Error Subcode MUST be set to Unacceptable Hold Time.  An
   1386    implementation MUST reject Hold Time values of one or two seconds.
   1387    An implementation MAY reject any proposed Hold Time.  An
   1388    implementation which accepts a Hold Time MUST use the negotiated
   1389    value for the Hold Time.
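
Two of the checks above lend themselves to a short Go sketch: choosing
the Data field for Unsupported Version Number, and deciding whether a
proposed Hold Time is acceptable. Both function names are hypothetical.
The policyMin parameter stands for a local policy that this document
leaves open, and returning zero when no lower version is supported is an
assumption of the sketch, since the text does not cover that case.

```go
package main

// UnsupportedVersionData returns the 2-octet value for the Data field:
// the largest locally supported version that is less than the version
// the peer bid. It returns 0 if there is none (assumption).
func UnsupportedVersionData(supported []uint8, bid uint8) uint16 {
	var best uint8
	for _, v := range supported {
		if v < bid && v > best {
			best = v
		}
	}
	return uint16(best)
}

// HoldTimeAcceptable reports whether a proposed Hold Time (in seconds)
// is acceptable. Values of one or two seconds are always rejected.
// policyMin is a hypothetical local minimum for non-zero values; a
// Hold Time of zero is accepted here, which is also a policy choice.
func HoldTimeAcceptable(seconds, policyMin uint16) bool {
	switch {
	case seconds == 1 || seconds == 2:
		return false
	case seconds == 0:
		return true
	default:
		return seconds >= policyMin
	}
}
```

When HoldTimeAcceptable returns false, the Error Subcode to send is
Unacceptable Hold Time.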
   1390 
   1391    If the BGP Identifier field of the OPEN message is syntactically
   1392    incorrect, then the Error Subcode is set to Bad BGP Identifier.
   1393    Syntactic correctness means that the BGP Identifier field represents
   1394    a valid IP host address.
   1395 
   1396    If one of the Optional Parameters in the OPEN message is not
   1397    recognized, then the Error Subcode is set to Unsupported Optional
   1398    Parameters.
   1399 
   1407    If the OPEN message carries Authentication Information (as an
   1408    Optional Parameter), then the corresponding authentication procedure
   1409    is invoked.  If the authentication procedure (based on Authentication
   1410    Code and Authentication Data) fails, then the Error Subcode is set to
   1411    Authentication Failure.
   1412 
   1413 6.3 UPDATE message error handling.
   1414 
   1415    All errors detected while processing the UPDATE message are indicated
   1416    by sending the NOTIFICATION message with Error Code UPDATE Message
   1417    Error.  The Error Subcode elaborates on the specific nature of the
   1418    error.
   1419 
   1420    Error checking of an UPDATE message begins by examining the path
   1421    attributes.  If the Unfeasible Routes Length or Total Attribute
   1422    Length is too large (i.e., if Unfeasible Routes Length + Total
   1423    Attribute Length + 23 exceeds the message Length), then the Error
   1424    Subcode is set to Malformed Attribute List.
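
The length test above can be sketched in Go as follows. The function
name UpdateLengthsOK is hypothetical, and the sketch assumes that msg
holds exactly one complete UPDATE message, header included, so that
len(msg) equals the header Length field. The constant 23 is the 19-octet
header plus the two 2-octet length fields.

```go
package main

import "encoding/binary"

// UpdateLengthsOK reports whether the Unfeasible Routes Length and the
// Total Path Attribute Length fit inside the message. A false result
// corresponds to the Malformed Attribute List subcode.
func UpdateLengthsOK(msg []byte) bool {
	if len(msg) < 23 {
		return false
	}
	unfeasible := int(binary.BigEndian.Uint16(msg[19:21]))
	// The Total Path Attribute Length field follows the withdrawn
	// routes; make sure it lies inside the message before reading it.
	if 23+unfeasible > len(msg) {
		return false
	}
	attrs := int(binary.BigEndian.Uint16(msg[21+unfeasible : 23+unfeasible]))
	return unfeasible+attrs+23 <= len(msg)
}
```

Whatever remains after the two variable-length fields is the NLRI, which
is checked separately.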
   1425 
   1426    If any recognized attribute has Attribute Flags that conflict with
   1427    the Attribute Type Code, then the Error Subcode is set to Attribute
   1428    Flags Error.  The Data field contains the erroneous attribute (type,
   1429    length and value).
   1430 
   1431    If any recognized attribute has Attribute Length that conflicts with
   1432    the expected length (based on the attribute type code), then the
   1433    Error Subcode is set to Attribute Length Error.  The Data field
   1434    contains the erroneous attribute (type, length and value).
   1435 
   1436    If any of the mandatory well-known attributes are not present, then
   1437    the Error Subcode is set to Missing Well-known Attribute.  The Data
   1438    field contains the Attribute Type Code of the missing well-known
   1439    attribute.
   1440 
   1441    If any of the mandatory well-known attributes are not recognized,
   1442    then the Error Subcode is set to Unrecognized Well-known Attribute.
   1443    The Data field contains the unrecognized attribute (type, length and
   1444    value).
   1445 
   1446    If the ORIGIN attribute has an undefined value, then the Error
   1447    Subcode is set to Invalid Origin Attribute.  The Data field contains
   1448    the unrecognized attribute (type, length and value).
   1449 
   1450    If the NEXT_HOP attribute field is syntactically incorrect, then the
   1451    Error Subcode is set to Invalid NEXT_HOP Attribute.  The Data field
   1452    contains the incorrect attribute (type, length and value).  Syntactic
   1453    correctness means that the NEXT_HOP attribute represents a valid IP
   1454    host address.  Semantic correctness applies only to the external BGP
   1463    links. It means that the interface associated with the IP address, as
   1464    specified in the NEXT_HOP attribute, shares a common subnet with the
   1465    receiving BGP speaker and is not the IP address of the receiving BGP
   1466    speaker.  If the NEXT_HOP attribute is semantically incorrect, the
   1467    error should be logged, and the route should be ignored.  In this
   1468    case, no NOTIFICATION message should be sent.
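
The semantic test on NEXT_HOP described above might look as follows in
Go. This is a sketch only: the function name is hypothetical, addresses
are 4-octet IPv4 values, and the receiving speaker is assumed to know
the address and prefix length of the interface on which the external
peer is reached.

```go
package main

import "encoding/binary"

// NextHopSemanticallyCorrect reports whether nextHop shares a subnet
// with the receiving speaker's interface (localAddr/prefixLen) and is
// not the receiving speaker's own address.
func NextHopSemanticallyCorrect(nextHop, localAddr [4]byte, prefixLen int) bool {
	if prefixLen < 0 || prefixLen > 32 {
		return false
	}
	if nextHop == localAddr {
		return false
	}
	var mask uint32
	if prefixLen > 0 {
		mask = ^uint32(0) << uint(32-prefixLen)
	}
	nh := binary.BigEndian.Uint32(nextHop[:])
	la := binary.BigEndian.Uint32(localAddr[:])
	return nh&mask == la&mask
}
```

A false result leads to logging and ignoring the route, not to a
NOTIFICATION message.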
   1469 
   1470    The AS_PATH attribute is checked for syntactic correctness.  If the
   1471    path is syntactically incorrect, then the Error Subcode is set to
   1472    Malformed AS_PATH.
   1473 
   1474    If an optional attribute is recognized, then the value of this
   1475    attribute is checked.  If an error is detected, the attribute is
   1476    discarded, and the Error Subcode is set to Optional Attribute Error.
   1477    The Data field contains the attribute (type, length and value).
   1478 
   1479    If any attribute appears more than once in the UPDATE message, then
   1480    the Error Subcode is set to Malformed Attribute List.
   1481 
   1482    The NLRI field in the UPDATE message is checked for syntactic
   1483    validity.  If the field is syntactically incorrect, then the Error
   1484    Subcode is set to Invalid Network Field.
   1485 
   1486 6.4 NOTIFICATION message error handling.
   1487 
   1488    If a peer sends a NOTIFICATION message, and there is an error in that
   1489    message, there is unfortunately no means of reporting this error via
   1490    a subsequent NOTIFICATION message.  Any such error, such as an
   1491    unrecognized Error Code or Error Subcode, should be noticed, logged
   1492    locally, and brought to the attention of the administration of the
   1493    peer.  The means to do this, however, lies outside the scope of this
   1494    document.
   1495 
   1496 6.5 Hold Timer Expired error handling.
   1497 
   1498    If a system does not receive successive KEEPALIVE and/or UPDATE
   1499    and/or NOTIFICATION messages within the period specified in the Hold
   1500    Time field of the OPEN message, then the NOTIFICATION message with
   1501    Hold Timer Expired Error Code must be sent and the BGP connection
   1502    closed.
   1503 
   1504 6.6 Finite State Machine error handling.
   1505 
   1506    Any error detected by the BGP Finite State Machine (e.g., receipt of
   1507    an unexpected event) is indicated by sending the NOTIFICATION message
   1508    with Error Code Finite State Machine Error.
   1509 
   1519 6.7 Cease.
   1520 
   1521    In absence of any fatal errors (that are indicated in this section),
   1522    a BGP peer may choose at any given time to close its BGP connection
   1523    by sending the NOTIFICATION message with Error Code Cease.  However,
   1524    the Cease NOTIFICATION message must not be used when a fatal error
   1525    indicated by this section does exist.
   1526 
   1527 6.8 Connection collision detection.
   1528 
   1529    If a pair of BGP speakers try simultaneously to establish a TCP
   1530    connection to each other, then two parallel connections between this
   1531    pair of speakers might well be formed.  We refer to this situation as
   1532    connection collision.  Clearly, one of these connections must be
   1533    closed.
   1534 
   1535    Based on the value of the BGP Identifier a convention is established
   1536    for detecting which BGP connection is to be preserved when a
   1537    collision does occur. The convention is to compare the BGP
   1538    Identifiers of the peers involved in the collision and to retain only
   1539    the connection initiated by the BGP speaker with the higher-valued
   1540    BGP Identifier.
   1541 
   1542    Upon receipt of an OPEN message, the local system must examine all of
   1543    its connections that are in the OpenConfirm state.  A BGP speaker may
   1544    also examine connections in an OpenSent state if it knows the BGP
   1545    Identifier of the peer by means outside of the protocol.  If among
   1546    these connections there is a connection to a remote BGP speaker whose
   1547    BGP Identifier equals the one in the OPEN message, then the local
   1548    system performs the following collision resolution procedure:
   1549 
   1550       1. The BGP Identifier of the local system is compared to the BGP
   1551       Identifier of the remote system (as specified in the OPEN
   1552       message).
   1553 
   1554       2. If the value of the local BGP Identifier is less than the
   1555       remote one, the local system closes BGP connection that already
   1556       exists (the one that is already in the OpenConfirm state), and
   1557       accepts BGP connection initiated by the remote system.
   1558 
   1559       3. Otherwise, the local system closes newly created BGP connection
   1560       (the one associated with the newly received OPEN message), and
   1561       continues to use the existing one (the one that is already in the
   1562       OpenConfirm state).
   1563 
   1564       Comparing BGP Identifiers is done by treating them as (4-octet
   1565       long) unsigned integers.
   1566 
   1575       A connection collision with an existing BGP connection that is in
   1576       the Established state causes unconditional closing of the newly
   1577       created connection. Note that a connection collision cannot be
   1578       detected with connections that are in Idle, or Connect, or Active
   1579       states.
   1580 
   1581       Closing the BGP connection (that results from the collision
   1582       resolution procedure) is accomplished by sending the NOTIFICATION
   1583       message with the Error Code Cease.
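
The collision resolution procedure can be summarized by the following Go
sketch. The type and function names are hypothetical. The caller is
assumed to have already found an existing connection whose peer BGP
Identifier equals the one in the received OPEN message. Connections in
the OpenSent state are treated as undetectable here, which corresponds
to a speaker that does not learn the peer's BGP Identifier by outside
means.

```go
package main

import "encoding/binary"

type State int

const (
	Idle State = iota
	Connect
	Active
	OpenSent
	OpenConfirm
	Established
)

type Decision int

const (
	NoCollision   Decision = iota // nothing to resolve
	CloseExisting                 // send Cease on the existing connection
	CloseNew                      // send Cease on the new connection
)

// ResolveCollision decides which connection to close, given the state
// of the existing connection and the two BGP Identifiers.
func ResolveCollision(existing State, localID, remoteID [4]byte) Decision {
	switch existing {
	case Established:
		return CloseNew // unconditional
	case OpenConfirm:
		// Identifiers are compared as 4-octet unsigned integers.
		local := binary.BigEndian.Uint32(localID[:])
		remote := binary.BigEndian.Uint32(remoteID[:])
		if local < remote {
			return CloseExisting
		}
		return CloseNew
	default:
		return NoCollision
	}
}
```

In both closing cases the connection is shut down with a NOTIFICATION
message carrying Error Code Cease.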
   1584 
   1585 7.  BGP Version Negotiation.
   1586 
   1587    BGP speakers may negotiate the version of the protocol by making
   1588    multiple attempts to open a BGP connection, starting with the highest
   1589    version number each supports.  If an open attempt fails with an Error
   1590    Code OPEN Message Error, and an Error Subcode Unsupported Version
   1591    Number, then the BGP speaker has available the version number it
   1592    tried, the version number its peer tried, the version number passed
   1593    by its peer in the NOTIFICATION message, and the version numbers that
   1594    it supports.  If the two peers do support one or more common
   1595    versions, then this will allow them to rapidly determine the highest
   1596    common version. In order to support BGP version negotiation, future
   1597    versions of BGP must retain the format of the OPEN and NOTIFICATION
   1598    messages.
   1599 
   1600 8.  BGP Finite State Machine.
   1601 
   1602    This section specifies BGP operation in terms of a Finite State
   1603    Machine (FSM).  Following is a brief summary and overview of BGP
   1604    operations by state as determined by this FSM.  A condensed version
   1605    of the BGP FSM is found in Appendix 1.
   1606 
   1607       Initially BGP is in the Idle state.
   1608 
   1609       Idle state:
   1610 
   1611          In this state BGP refuses all incoming BGP connections.  No
   1612          resources are allocated to the peer.  In response to the Start
   1613          event (initiated by either system or operator) the local system
   1614          initializes all BGP resources, starts the ConnectRetry timer,
   1615          initiates a transport connection to the other BGP peer while
   1616          listening for a connection that may be initiated by the remote
   1617          BGP peer, and changes its state to Connect.  The exact value of
   1618          the ConnectRetry timer is a local matter, but should be
   1619          sufficiently large to allow TCP initialization.
   1620 
   1621          If a BGP speaker detects an error, it shuts down the connection
   1622          and changes its state to Idle. Getting out of the Idle state
   1631          requires generation of the Start event.  If such an event is
   1632          generated automatically, then persistent BGP errors may result
   1633          in persistent flapping of the speaker.  To avoid such a
   1634          condition it is recommended that Start events should not be
   1635          generated immediately for a peer that was previously
   1636          transitioned to Idle due to an error. For a peer that was
   1637          previously transitioned to Idle due to an error, the time
   1638          between consecutive generation of Start events, if such events
   1639          are generated automatically, shall exponentially increase. The
   1640          value of the initial timer shall be 60 seconds. The time shall
   1641          be doubled for each consecutive retry.
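
The back-off rule for automatically generated Start events can be
sketched in Go as follows. The 60-second initial value and the doubling
come from the text above. The upper bound MaxStartDelay is an assumption
of this sketch (the text specifies none), added so that the delay cannot
grow without limit, and the function name is hypothetical.

```go
package main

import "time"

const (
	InitialStartDelay = 60 * time.Second
	MaxStartDelay     = time.Hour // assumption: not specified in the text
)

// StartDelay returns the time to wait before generating the next Start
// event for a peer that went to Idle because of an error. retry is the
// number of consecutive retries already made (0 for the first one).
func StartDelay(retry int) time.Duration {
	d := InitialStartDelay
	for i := 0; i < retry && d < MaxStartDelay; i++ {
		d *= 2
	}
	if d > MaxStartDelay {
		d = MaxStartDelay
	}
	return d
}
```

With these values the delays run 60, 120, 240, 480 seconds and so on
until the assumed one-hour ceiling is reached.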
   1642 
   1643          Any other event received in the Idle state is ignored.
   1644 
   1645       Connect state:
   1646 
   1647          In this state BGP is waiting for the transport protocol
   1648          connection to be completed.
   1649 
   1650          If the transport protocol connection succeeds, the local system
   1651          clears the ConnectRetry timer, completes initialization, sends
   1652          an OPEN message to its peer, and changes its state to OpenSent.
   1653 
   1654          If the transport protocol connect fails (e.g., retransmission
   1655          timeout), the local system restarts the ConnectRetry timer,
   1656          continues to listen for a connection that may be initiated by
   1657          the remote BGP peer, and changes its state to Active state.
   1658 
   1659          In response to the ConnectRetry timer expired event, the local
   1660          system restarts the ConnectRetry timer, initiates a transport
   1661          connection to other BGP peer, continues to listen for a
   1662          connection that may be initiated by the remote BGP peer, and
   1663          stays in the Connect state.
   1664 
   1665          Start event is ignored in the Connect state.
   1666 
   1667          In response to any other event (initiated by either system or
   1668          operator), the local system releases all BGP resources
   1669          associated with this connection and changes its state to Idle.
   1670 
   1671       Active state:
   1672 
   1673          In this state BGP is trying to acquire a peer by initiating a
   1674          transport protocol connection.
   1675 
   1676          If the transport protocol connection succeeds, the local system
   1677          clears the ConnectRetry timer, completes initialization, sends
   1678          an OPEN message to its peer, sets its Hold Timer to a large
   1687          value, and changes its state to OpenSent.  A Hold Timer value
   1688          of 4 minutes is suggested.
   1689 
   1690          In response to the ConnectRetry timer expired event, the local
   1691          system restarts the ConnectRetry timer, initiates a transport
   1692          connection to other BGP peer, continues to listen for a
   1693          connection that may be initiated by the remote BGP peer, and
   1694          changes its state to Connect.
   1695 
   1696          If the local system detects that a remote peer is trying to
   1697          establish BGP connection to it, and the IP address of the
   1698          remote peer is not an expected one, the local system restarts
   1699          the ConnectRetry timer, rejects the attempted connection,
   1700          continues to listen for a connection that may be initiated by
   1701          the remote BGP peer, and stays in the Active state.
   1702 
   1703          Start event is ignored in the Active state.
   1704 
   1705          In response to any other event (initiated by either system or
   1706          operator), the local system releases all BGP resources
   1707          associated with this connection and changes its state to Idle.
   1708 
   1709       OpenSent state:
   1710 
   1711          In this state BGP waits for an OPEN message from its peer.
   1712          When an OPEN message is received, all fields are checked for
   1713          correctness.  If the BGP message header checking or OPEN
   1714          message checking detects an error (see Section 6.2), or a
   1715          connection collision (see Section 6.8) the local system sends a
   1716          NOTIFICATION message and changes its state to Idle.
   1717 
   1718          If there are no errors in the OPEN message, BGP sends a
   1719          KEEPALIVE message and sets a KeepAlive timer.  The Hold Timer,
   1720          which was originally set to a large value (see above), is
   1721          replaced with the negotiated Hold Time value (see section 4.2).
   1722          If the negotiated Hold Time value is zero, then the Hold Timer
   1723          and KeepAlive timer are not started.  If the value of
   1724          the Autonomous System field is the same as the local Autonomous
   1725          System number, then the connection is an "internal" connection;
   1726          otherwise, it is "external".  (This will affect UPDATE
   1727          processing as described below.)  Finally, the state is changed
   1728          to OpenConfirm.
   1729 
   1730          If a disconnect notification is received from the underlying
   1731          transport protocol, the local system closes the BGP connection,
   1732          restarts the ConnectRetry timer, continues to listen for a
   1733          connection that may be initiated by the remote BGP peer, and
   1734          goes into the Active state.
   1735 
   1743          If the Hold Timer expires, the local system sends NOTIFICATION
   1744          message with error code Hold Timer Expired and changes its
   1745          state to Idle.
   1746 
   1747          In response to the Stop event (initiated by either system or
   1748          operator) the local system sends NOTIFICATION message with
   1749          Error Code Cease and changes its state to Idle.
   1750 
   1751          Start event is ignored in the OpenSent state.
   1752 
   1753          In response to any other event the local system sends
   1754          NOTIFICATION message with Error Code Finite State Machine Error
   1755          and changes its state to Idle.
   1756 
   1757          Whenever BGP changes its state from OpenSent to Idle, it closes
   1758          the BGP (and transport-level) connection and releases all
   1759          resources associated with that connection.
   1760 
   1761       OpenConfirm state:
   1762 
   1763          In this state BGP waits for a KEEPALIVE or NOTIFICATION
   1764          message.
   1765 
   1766          If the local system receives a KEEPALIVE message, it changes
   1767          its state to Established.
   1768 
   1769          If the Hold Timer expires before a KEEPALIVE message is
   1770          received, the local system sends NOTIFICATION message with
   1771          error code Hold Timer Expired and changes its state to Idle.
   1772 
   1773          If the local system receives a NOTIFICATION message, it changes
   1774          its state to Idle.
   1775 
   1776          If the KeepAlive timer expires, the local system sends a
   1777          KEEPALIVE message and restarts its KeepAlive timer.
   1778 
   1779          If a disconnect notification is received from the underlying
   1780          transport protocol, the local system changes its state to Idle.
   1781 
   1782          In response to the Stop event (initiated by either system or
   1783          operator) the local system sends NOTIFICATION message with
   1784          Error Code Cease and changes its state to Idle.
   1785 
   1786          Start event is ignored in the OpenConfirm state.
   1787 
   1788          In response to any other event the local system sends
   1789          NOTIFICATION message with Error Code Finite State Machine Error
   1790          and changes its state to Idle.
   1791 
   1799          Whenever BGP changes its state from OpenConfirm to Idle, it
   1800          closes the BGP (and transport-level) connection and releases
   1801          all resources associated with that connection.
   1802 
   1803       Established state:
   1804 
   1805          In the Established state BGP can exchange UPDATE, NOTIFICATION,
   1806          and KEEPALIVE messages with its peer.
   1807 
   1808          If the local system receives an UPDATE or KEEPALIVE message, it
   1809          restarts its Hold Timer, if the negotiated Hold Time value is
   1810          non-zero.
   1811 
   1812          If the local system receives a NOTIFICATION message, it changes
   1813          its state to Idle.
   1814 
   1815          If the local system receives an UPDATE message and the UPDATE
   1816          message error handling procedure (see Section 6.3) detects an
   1817          error, the local system sends a NOTIFICATION message and
   1818          changes its state to Idle.
   1819 
   1820          If a disconnect notification is received from the underlying
   1821          transport protocol, the local system changes its state to Idle.
   1822 
   1823          If the Hold Timer expires, the local system sends a
   1824          NOTIFICATION message with Error Code Hold Timer Expired and
   1825          changes its state to Idle.
   1826 
   1827          If the KeepAlive timer expires, the local system sends a
   1828          KEEPALIVE message and restarts its KeepAlive timer.
   1829 
   1830          Each time the local system sends a KEEPALIVE or UPDATE message,
   1831          it restarts its KeepAlive timer, unless the negotiated Hold
   1832          Time value is zero.
   1833 
   1834          In response to the Stop event (initiated by either system or
   1835          operator), the local system sends a NOTIFICATION message with
   1836          Error Code Cease and changes its state to Idle.
   1837 
   1838          Start event is ignored in the Established state.
   1839 
   1840          In response to any other event, the local system sends
   1841          NOTIFICATION message with Error Code Finite State Machine Error
   1842          and changes its state to Idle.
   1843 
   1844          Whenever BGP changes its state from Established to Idle, it
   1845          closes the BGP (and transport-level) connection, releases all
   1846          resources associated with that connection, and deletes all
   1855          routes derived from that connection.
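
As an illustration of how one state of the FSM maps onto code, the
following Go sketch encodes the Established state described above. All
type, constant, and function names are hypothetical, and timers and I/O
are reduced to flags in the returned Outcome. The numeric Error Codes
(3 UPDATE Message Error, 4 Hold Timer Expired, 5 Finite State Machine
Error, 6 Cease) are those of the NOTIFICATION message definition.

```go
package main

type State int

const (
	Idle State = iota
	Connect
	Active
	OpenSent
	OpenConfirm
	Established
)

type Event int

const (
	Start Event = iota
	Stop
	RecvUpdate      // UPDATE received, no error detected
	RecvUpdateError // UPDATE received, Section 6.3 check failed
	RecvKeepalive
	RecvNotification
	TransportClosed
	HoldTimerExpired
	KeepAliveTimerExpired
	Unexpected // any other event
)

// Outcome describes what the local system does in response to an event.
type Outcome struct {
	Next          State
	NotifyCode    uint8 // Error Code to send; 0 means no NOTIFICATION
	SendKeepalive bool  // send KEEPALIVE and restart the KeepAlive timer
	RestartHold   bool  // restart the Hold Timer
}

// InEstablished returns the outcome of an event in the Established
// state. holdTime is the negotiated Hold Time in seconds.
func InEstablished(ev Event, holdTime uint16) Outcome {
	switch ev {
	case RecvUpdate, RecvKeepalive:
		return Outcome{Next: Established, RestartHold: holdTime != 0}
	case KeepAliveTimerExpired:
		return Outcome{Next: Established, SendKeepalive: true}
	case Start:
		return Outcome{Next: Established} // ignored
	case RecvNotification, TransportClosed:
		return Outcome{Next: Idle}
	case RecvUpdateError:
		return Outcome{Next: Idle, NotifyCode: 3}
	case HoldTimerExpired:
		return Outcome{Next: Idle, NotifyCode: 4}
	case Stop:
		return Outcome{Next: Idle, NotifyCode: 6}
	default:
		return Outcome{Next: Idle, NotifyCode: 5}
	}
}
```

Any outcome whose Next state is Idle also implies closing the
connection, releasing its resources, and deleting the routes derived
from it.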
   1856 
   1857 9.  UPDATE Message Handling
   1858 
   1859    An UPDATE message may be received only in the Established state.
   1860    When an UPDATE message is received, each field is checked for
   1861    validity as specified in Section 6.3.
   1862 
   1863    If an optional non-transitive attribute is unrecognized, it is
   1864    quietly ignored.  If an optional transitive attribute is
   1865    unrecognized, the Partial bit (the third high-order bit) in the
   1866    attribute flags octet is set to 1, and the attribute is retained for
   1867    propagation to other BGP speakers.
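
The treatment of unrecognized attributes can be sketched in Go as shown
below. The names are hypothetical. The flag values follow the attribute
flags octet layout, in which the high-order bit is Optional, the second
is Transitive, and the third is Partial. An unrecognized attribute that
is not optional is reported as an error, in line with Section 6.3.

```go
package main

const (
	FlagOptional   uint8 = 0x80
	FlagTransitive uint8 = 0x40
	FlagPartial    uint8 = 0x20
)

type Disposition int

const (
	Ignore  Disposition = iota // quietly ignored, not propagated
	Forward                    // retained for propagation
	Reject                     // unrecognized well-known attribute
)

// UnrecognizedAttribute returns the flags to store and the action to
// take for an attribute whose type code is not recognized.
func UnrecognizedAttribute(flags uint8) (uint8, Disposition) {
	switch {
	case flags&FlagOptional == 0:
		return flags, Reject
	case flags&FlagTransitive == 0:
		return flags, Ignore
	default:
		return flags | FlagPartial, Forward
	}
}
```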
   1868 
   1869    If an optional attribute is recognized, and has a valid value, then,
   1870    depending on the type of the optional attribute, it is processed
   1871    locally, retained, and updated, if necessary, for possible
   1872    propagation to other BGP speakers.
   1873 
   1874    If the UPDATE message contains a non-empty WITHDRAWN ROUTES field,
   1875    the previously advertised routes whose destinations (expressed as IP
   1876    prefixes) are contained in this field shall be removed from the
   1877    Adj-RIB-In.  This BGP speaker shall run its Decision Process since
   1878    the previously advertised route is no longer available for use.
   1879 
   1880    If the UPDATE message contains a feasible route, it shall be placed
   1881    in the appropriate Adj-RIB-In, and the following additional actions
   1882    shall be taken:
   1883 
   1884    i) If its Network Layer Reachability Information (NLRI) is identical
   1885    to the one of a route currently stored in the Adj-RIB-In, then the
   1886    new route shall replace the older route in the Adj-RIB-In, thus
   1887    implicitly withdrawing the older route from service. The BGP speaker
   1888    shall run its Decision Process since the older route is no longer
   1889    available for use.
   1890 
   1891    ii) If the new route is an overlapping route that is included (see
   1892    9.1.4) in an earlier route contained in the Adj-RIB-In, the BGP
   1893    speaker shall run its Decision Process since the more specific route
   1894    has implicitly made a portion of the less specific route unavailable
   1895    for use.
   1896 
   1897    iii) If the new route has identical path attributes to an earlier
   1898    route contained in the Adj-RIB-In, and is more specific (see 9.1.4)
   1899    than the earlier route, no further actions are necessary.
   1900 
   1901    iv) If the new route has NLRI that is not present in any of the
   1902    routes currently stored in the Adj-RIB-In, then the new route shall
   1911    be placed in the Adj-RIB-In. The BGP speaker shall run its Decision
   1912    Process.
   1913 
   1914    v) If the new route is an overlapping route that is less specific
   1915    (see 9.1.4) than an earlier route contained in the Adj-RIB-In, the
   1916    BGP speaker shall run its Decision Process on the set of destinations
   1917    described only by the less specific route.
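
Cases i) through v) depend on how the NLRI of the new route relates to
that of an earlier route. A minimal Go sketch of that comparison for
IPv4 prefixes follows. The Prefix type and the Classify function are
hypothetical, and prefix lengths are assumed to be at most 32.

```go
package main

// Prefix is an IPv4 prefix: a 32-bit address and a length in bits.
type Prefix struct {
	Addr uint32
	Len  uint
}

func mask(n uint) uint32 {
	if n == 0 {
		return 0
	}
	return ^uint32(0) << (32 - n)
}

// covers reports whether every destination in q is also in p.
func (p Prefix) covers(q Prefix) bool {
	return p.Len <= q.Len && (p.Addr^q.Addr)&mask(p.Len) == 0
}

type Overlap int

const (
	Disjoint     Overlap = iota // no common destinations
	Identical                   // same NLRI, as in case i)
	MoreSpecific                // new route included in the earlier one
	LessSpecific                // new route includes the earlier one
)

// Classify relates the NLRI of a new route to that of an earlier route.
func Classify(newRoute, earlier Prefix) Overlap {
	switch {
	case newRoute.Len == earlier.Len && earlier.covers(newRoute):
		return Identical
	case earlier.covers(newRoute):
		return MoreSpecific
	case newRoute.covers(earlier):
		return LessSpecific
	default:
		return Disjoint
	}
}
```

A new route that is Disjoint from every stored route corresponds to case
iv). MoreSpecific corresponds to cases ii) and iii), which differ only
in whether the path attributes match, and LessSpecific to case v).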
   1918 
   1919 9.1 Decision Process
   1920 
   1921    The Decision Process selects routes for subsequent advertisement by
   1922    applying the policies in the local Policy Information Base (PIB) to
   1923    the routes stored in its Adj-RIB-In. The output of the Decision
   1924    Process is the set of routes that will be advertised to all peers;
   1925    the selected routes will be stored in the local speaker's Adj-RIB-
   1926    Out.
   1927 
   1928    The selection process is formalized by defining a function that takes
   1929    the attribute of a given route as an argument and returns a non-
   1930    negative integer denoting the degree of preference for the route.
   1931    The function that calculates the degree of preference for a given
   1932    route shall not use as its inputs any of the following:  the
   1933    existence of other routes, the non-existence of other routes, or the
   1934    path attributes of other routes. Route selection then consists of
   1935    individual application of the degree of preference function to each
   1936    feasible route, followed by the choice of the one with the highest
   1937    degree of preference.
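
           The selection rule above can be sketched in Python. This is a
           minimal illustration, not part of the protocol: the Route class,
           its fields, and the policy inside degree_of_preference() are all
           hypothetical. What it shows is that the function sees one route
           at a time and that selection is a separate step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str        # destination, e.g. "192.0.2.0/24"
    peer: str          # hypothetical label for the advertising peer
    as_path_len: int   # one path attribute the policy may inspect


def degree_of_preference(route: Route) -> int:
    """Hypothetical local policy: shorter AS paths score higher.

    Only the attributes of the given route are used, and the result
    is always a non-negative integer.
    """
    return max(0, 100 - route.as_path_len)


def select_best(feasible_routes):
    """Score each route on its own, then keep the highest per destination."""
    best = {}
    for route in feasible_routes:
        current = best.get(route.prefix)
        if current is None or degree_of_preference(route) > degree_of_preference(current):
            best[route.prefix] = route
    return best
```

           Routes with equal scores are left to the tie-breaking rules; this
           sketch simply keeps the first one it saw.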
   1938 
   1939    The Decision Process operates on routes contained in each Adj-RIB-In,
   1940    and is responsible for:
   1941 
   1942       - selection of routes to be advertised to BGP speakers located in
   1943       the local speaker's autonomous system
   1944 
   1945       - selection of routes to be advertised to BGP speakers located in
   1946       neighboring autonomous systems
   1947 
   1948       - route aggregation and route information reduction
   1949 
   1950    The Decision Process takes place in three distinct phases, each
   1951    triggered by a different event:
   1952 
   1953       a) Phase 1 is responsible for calculating the degree of preference
   1954       for each route received from a BGP speaker located in a
   1955       neighboring autonomous system, and for advertising to the other
   1956       BGP speakers in the local autonomous system the routes that have
   1957       the highest degree of preference for each distinct destination.
   1958 
   1967       b) Phase 2 is invoked on completion of phase 1. It is responsible
   1968       for choosing the best route out of all those available for each
   1969       distinct destination, and for installing each chosen route into
   1970       the appropriate Loc-RIB.
   1971 
   1972       c) Phase 3 is invoked after the Loc-RIB has been modified. It is
   1973       responsible for disseminating routes in the Loc-RIB to each peer
   1974       located in a neighboring autonomous system, according to the
   1975       policies contained in the PIB. Route aggregation and information
   1976       reduction can optionally be performed within this phase.
   1977 
   1978 9.1.1 Phase 1: Calculation of Degree of Preference
   1979 
   1980    The Phase 1 decision function shall be invoked whenever the local BGP
   1981    speaker receives an UPDATE message from a peer located in a
   1982    neighboring autonomous system that advertises a new route, a
   1983    replacement route, or a withdrawn route.
   1984 
   1985    The Phase 1 decision function is a separate process which completes
   1986    when it has no further work to do.
   1987 
   1988    The Phase 1 decision function shall lock an Adj-RIB-In prior to
   1989    operating on any route contained within it, and shall unlock it after
   1990    operating on all new or unfeasible routes contained within it.
   1991 
   1992    For each newly received or replacement feasible route, the local BGP
   1993    speaker shall determine a degree of preference. If the route is
   1994    learned from a BGP speaker in the local autonomous system, either the
   1995    value of the LOCAL_PREF attribute shall be taken as the degree of
   1996    preference, or the local system shall compute the degree of
   1997    preference of the route based on preconfigured policy information. If
   1998    the route is learned from a BGP speaker in a neighboring autonomous
   1999    system, then the degree of preference shall be computed based on
   2000    preconfigured policy information.  The exact nature of this policy
   2001    information and the computation involved is a local matter.  The
   2002    local speaker shall then run the internal update process of 9.2.1 to
   2003    select and advertise the most preferable route.
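
           As a rough sketch of the rule above, assuming a route is held as
           a dict of path attributes and local policy is any callable that
           returns a non-negative integer (both are illustrative choices,
           as are the function and parameter names):

```python
def phase1_preference(route, from_local_as, policy, trust_local_pref=True):
    """Return the degree of preference for one feasible route.

    route: dict of path attributes (hypothetical representation).
    from_local_as: True if the route was learned from a speaker in the local AS.
    policy: callable standing in for preconfigured policy information.
    """
    if from_local_as and trust_local_pref:
        # Internal peer: LOCAL_PREF may be taken as the degree of preference.
        return route["LOCAL_PREF"]
    # External peer, or an internal route the local system re-scores itself.
    return policy(route)
```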
   2004 
   2005 9.1.2 Phase 2: Route Selection
   2006 
   2007    The Phase 2 decision function shall be invoked on completion of Phase
   2008    1.  The Phase 2 function is a separate process which completes when
   2009    it has no further work to do. The Phase 2 process shall consider all
   2010    routes that are present in the Adj-RIBs-In, including those received
   2011    from BGP speakers located in its own autonomous system and those
   2012    received from BGP speakers located in neighboring autonomous systems.
   2013 
   2023    The Phase 2 decision function shall be blocked from running while the
   2024    Phase 3 decision function is in process. The Phase 2 function shall
   2025    lock all Adj-RIBs-In prior to commencing its function, and shall
   2026    unlock them on completion.
   2027 
   2028    If the NEXT_HOP attribute of a BGP route depicts an address to which
   2029    the local BGP speaker doesn't have a route in its Loc-RIB, the BGP
   2030    route SHOULD be excluded from the Phase 2 decision function.
   2031 
   2032    For each set of destinations for which a feasible route exists in the
   2033    Adj-RIBs-In, the local BGP speaker shall identify the route that:
   2034 
   2035       a) has the highest degree of preference of any route to the same
   2036       set of destinations, or
   2037 
   2038       b) is the only route to that destination, or
   2039 
   2040       c) is selected as a result of the Phase 2 tie breaking rules
   2041       specified in 9.1.2.1.
   2042 
   2043    The local speaker SHALL then install that route in the Loc-RIB,
   2044    replacing any route to the same destination that is currently being
   2045    held in the Loc-RIB. The local speaker MUST determine the immediate
   2046    next hop to the address depicted by the NEXT_HOP attribute of the
   2047    selected route by performing a lookup in the IGP and selecting one of
   2048    the possible paths in the IGP.  This immediate next hop MUST be used
   2049    when installing the selected route in the Loc-RIB.  If the route to
   2050    the address depicted by the NEXT_HOP attribute changes such that the
   2051    immediate next hop changes, route selection should be recalculated as
   2052    specified above.
   2053 
   2054    Unfeasible routes shall be removed from the Loc-RIB, and
   2055    corresponding unfeasible routes shall then be removed from the Adj-
   2056    RIBs-In.
   2057 
   2058 9.1.2.1 Breaking Ties (Phase 2)
   2059 
   2060    In its Adj-RIBs-In a BGP speaker may have several routes to the same
   2061    destination that have the same degree of preference. The local
   2062    speaker can select only one of these routes for inclusion in the
   2063    associated Loc-RIB. The local speaker considers all equally
   2064    preferable routes, both those received from BGP speakers located in
   2065    neighboring autonomous systems, and those received from other BGP
   2066    speakers located in the local speaker's autonomous system.
   2067 
   2068    The following tie-breaking procedure assumes that for each candidate
   2069    route all the BGP speakers within an autonomous system can ascertain
   2070    the cost of a path (interior distance) to the address depicted by the
   2079    NEXT_HOP attribute of the route.  Ties shall be broken according to
   2080    the following algorithm:
   2081 
   2082       a) If the local system is configured to take into account
   2083       MULTI_EXIT_DISC, and the candidate routes differ in their
   2084       MULTI_EXIT_DISC attribute, select the route that has the lowest
   2085       value of the MULTI_EXIT_DISC attribute.
   2086 
   2087       b) Otherwise, select the route that has the lowest cost (interior
   2088       distance) to the entity depicted by the NEXT_HOP attribute of the
   2089       route.  If there are several routes with the same cost, then the
   2090       tie shall be broken as follows:
   2091 
   2092          - if at least one of the candidate routes was advertised by the
   2093          BGP speaker in a neighboring autonomous system, select the
   2094          route that was advertised by the BGP speaker in a neighboring
   2095          autonomous system whose BGP Identifier has the lowest value
   2096          among all other BGP speakers in neighboring autonomous systems;
   2097 
   2098          - otherwise, select the route that was advertised by the BGP
   2099          speaker whose BGP Identifier has the lowest value.
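
           The tie-breaking procedure can be sketched as follows. The
           Candidate class and its field names are hypothetical. One point
           is an assumption of this sketch rather than something the text
           states: when several candidates share the lowest MULTI_EXIT_DISC,
           the sketch continues with step b) among those candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    med: int             # MULTI_EXIT_DISC value
    interior_cost: int   # interior distance to the NEXT_HOP address
    bgp_identifier: int  # BGP Identifier of the advertising speaker
    external: bool       # True if advertised from a neighboring AS


def break_tie(candidates, use_med=True):
    """Pick one route among equally preferable candidates."""
    remaining = list(candidates)

    # a) lowest MULTI_EXIT_DISC, if configured and the values differ
    if use_med and len({c.med for c in remaining}) > 1:
        lowest = min(c.med for c in remaining)
        remaining = [c for c in remaining if c.med == lowest]
        if len(remaining) == 1:
            return remaining[0]

    # b) lowest interior distance to the NEXT_HOP
    lowest_cost = min(c.interior_cost for c in remaining)
    remaining = [c for c in remaining if c.interior_cost == lowest_cost]
    if len(remaining) == 1:
        return remaining[0]

    # Same cost: prefer routes from neighboring ASs, then lowest BGP Identifier.
    external = [c for c in remaining if c.external]
    pool = external if external else remaining
    return min(pool, key=lambda c: c.bgp_identifier)
```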
   2100 
   2101 9.1.3 Phase 3: Route Dissemination
   2102 
   2103    The Phase 3 decision function shall be invoked on completion of Phase
   2104    2, or when any of the following events occur:
   2105 
   2106       a) when routes in a Loc-RIB to local destinations have changed
   2107 
   2108       b) when locally generated routes learned by means outside of BGP
   2109       have changed
   2110 
   2111       c) when a new BGP speaker-to-BGP speaker connection has been
   2112       established
   2113 
   2114    The Phase 3 function is a separate process which completes when it
   2115    has no further work to do. The Phase 3 Routing Decision function
   2116    shall be blocked from running while the Phase 2 decision function is
   2117    in process.
   2118 
   2119    All routes in the Loc-RIB shall be processed into a corresponding
   2120    entry in the associated Adj-RIBs-Out. Route aggregation and
   2121    information reduction techniques (see 9.2.4.1) may optionally be
   2122    applied.
   2123 
   2124    For the benefit of future support of inter-AS multicast capabilities,
   2125    a BGP speaker that participates in inter-AS multicast routing shall
   2126    advertise a route it receives from one of its external peers and if
   2135    it installs it in its Loc-RIB, it shall advertise it back to the peer
   2136    from which the route was received. For a BGP speaker that does not
   2137    participate in inter-AS multicast routing such an advertisement is
   2138    optional. When doing such an advertisement, the NEXT_HOP attribute
   2139    should be set to the address of the peer. An implementation may also
   2140    optimize such an advertisement by truncating information in the
   2141    AS_PATH attribute to include only its own AS number and that of the
   2142    peer that advertised the route (such truncation requires the ORIGIN
   2143    attribute to be set to INCOMPLETE).  In addition an implementation is
   2144    not required to pass optional or discretionary path attributes with
   2145    such an advertisement.
   2146 
   2147    When the updating of the Adj-RIBs-Out and the Forwarding Information
   2148    Base (FIB) is complete, the local BGP speaker shall run the external
   2149    update process of 9.2.2.
   2150 
   2151 9.1.4 Overlapping Routes
   2152 
   2153    A BGP speaker may transmit routes with overlapping Network Layer
   2154    Reachability Information (NLRI) to another BGP speaker. NLRI overlap
   2155    occurs when a set of destinations is identified in multiple
   2156    non-matching routes. Since BGP encodes NLRI using IP prefixes, overlap
   2157    will always exhibit subset relationships.  A route describing a
   2158    smaller set of destinations (a longer prefix) is said to be more
   2159    specific than a route describing a larger set of destinations (a
   2160    shorter prefix); similarly, a route describing a larger set of
   2161    destinations (a shorter prefix) is said to be less specific than a
   2162    route describing a smaller set of destinations (a longer prefix).
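
           The subset relationship can be checked with Python's standard
           ipaddress module. The helper name and its return strings are
           illustrative only, and both prefixes are assumed to be of the
           same IP version.

```python
import ipaddress


def compare_specificity(a: str, b: str) -> str:
    """Describe prefix a relative to prefix b."""
    net_a = ipaddress.ip_network(a)
    net_b = ipaddress.ip_network(b)
    if net_a == net_b:
        return "identical"
    if net_a.subnet_of(net_b):
        return "more specific"   # a is the longer prefix inside b
    if net_b.subnet_of(net_a):
        return "less specific"   # a is the shorter prefix containing b
    return "disjoint"            # prefixes never partially overlap
```

           For example, compare_specificity("10.1.0.0/16", "10.0.0.0/8")
           returns "more specific".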
   2163 
   2164    The precedence relationship effectively decomposes less specific
   2165    routes into two parts:
   2166 
   2167       -  a set of destinations described only by the less specific
   2168       route, and
   2169 
   2170       -  a set of destinations described by the overlap of the less
   2171       specific and the more specific routes
   2172 
   2173    When overlapping routes are present in the same Adj-RIB-In, the more
   2174    specific route shall take precedence, in order from most specific to
   2175    least specific.
   2176 
   2177    The set of destinations described by the overlap represents a portion
   2178    of the less specific route that is feasible, but is not currently in
   2179    use.  If a more specific route is later withdrawn, the set of
   2180    destinations described by the overlap will still be reachable using
   2181    the less specific route.
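
           A small sketch of this precedence, assuming routes are held in a
           plain dict from prefix string to an arbitrary label (a
           hypothetical stand-in for a real routing table):

```python
import ipaddress


def lookup(routes, destination):
    """Return the label of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    best_label, best_len = None, -1
    for prefix, label in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_label, best_len = label, net.prefixlen
    return best_label


routes = {"10.0.0.0/8": "less specific", "10.1.0.0/16": "more specific"}
in_overlap = lookup(routes, "10.1.2.3")       # covered by both routes
outside_overlap = lookup(routes, "10.2.0.1")  # covered by the /8 only

del routes["10.1.0.0/16"]                     # more specific route withdrawn
after_withdrawal = lookup(routes, "10.1.2.3")
```

           After the withdrawal, the destination in the overlap is still
           matched, now by the less specific route.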
   2182 
   2191    If a BGP speaker receives overlapping routes, the Decision Process
   2192    shall take into account the semantics of the overlapping routes. In
   2193    particular, if a BGP speaker accepts the less specific route while
   2194    rejecting the more specific route from the same peer, then traffic to
   2195    the destinations represented by the overlap may not be forwarded along
   2196    the ASs listed in the AS_PATH attribute of that route. Therefore, a BGP
   2197    speaker has the following choices:
   2198 
   2199       a)   Install both the less and the more specific routes
   2200 
   2201       b)   Install the more specific route only
   2202 
   2203       c)   Install the non-overlapping part of the less specific
   2204                  route only (that implies de-aggregation)
   2205 
   2206       d)   Aggregate the two routes and install the aggregated route
   2207 
   2208       e)   Install the less specific route only
   2209 
   2210       f)   Install neither route
   2211 
   2212    If a BGP speaker chooses e), then it should add the ATOMIC_AGGREGATE
   2213    attribute to the route. A route that carries the ATOMIC_AGGREGATE
   2214    attribute cannot be de-aggregated. That is, the NLRI of this route
   2215    cannot be made more specific.  Forwarding along such a route does
   2216    not guarantee that IP packets will actually traverse only ASs listed
   2217    in the AS_PATH attribute of the route.  If a BGP speaker chooses a),
   2218    it must not advertise the more general route without the more
   2219    specific route.
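
           Choice c) implies de-aggregation: the less specific prefix is
           split into the prefixes that do not cover the more specific one.
           A sketch using the standard ipaddress module (the function name
           is illustrative):

```python
import ipaddress


def non_overlapping_part(less_specific: str, more_specific: str):
    """Return the prefixes of less_specific that exclude more_specific.

    address_exclude() raises ValueError if more_specific is not
    contained in less_specific.
    """
    outer = ipaddress.ip_network(less_specific)
    inner = ipaddress.ip_network(more_specific)
    return sorted(outer.address_exclude(inner))
```

           For 10.0.0.0/22 with 10.0.1.0/24 carved out, this yields
           10.0.0.0/24 and 10.0.2.0/23.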
   2220 
   2221 9.2 Update-Send Process
   2222 
   2223    The Update-Send process is responsible for advertising UPDATE
   2224    messages to all peers. For example, it distributes the routes chosen
   2225    by the Decision Process to other BGP speakers which may be located in
   2226    either the same autonomous system or a neighboring autonomous system.
   2227    Rules for information exchange between BGP speakers located in
   2228    different autonomous systems are given in 9.2.2; rules for
   2229    information exchange between BGP speakers located in the same
   2230    autonomous system are given in 9.2.1.
   2231 
   2232    Distribution of routing information between a set of BGP speakers,
   2233    all of which are located in the same autonomous system, is referred
   2234    to as internal distribution.
   2235 
   2247 9.2.1 Internal Updates
   2248 
   2249    The Internal update process is concerned with the distribution of
   2250    routing information to BGP speakers located in the local speaker's
   2251    autonomous system.
   2252 
   2253    When a BGP speaker receives an UPDATE message from another BGP
   2254    speaker located in its own autonomous system, the receiving BGP
   2255    speaker shall not re-distribute the routing information contained in
   2256    that UPDATE message to other BGP speakers located in its own
   2257    autonomous system.
   2258 
   2259    When a BGP speaker receives a new route from a BGP speaker in a
   2260    neighboring autonomous system, it shall advertise that route to all
   2261    other BGP speakers in its autonomous system by means of an UPDATE
   2262    message if any of the following conditions occur:
   2263 
   2264       1) the degree of preference assigned to the newly received route
   2265       by the local BGP speaker is higher than the degree of preference
   2266       that the local speaker has assigned to other routes that have been
   2267       received from BGP speakers in neighboring autonomous systems, or
   2268 
   2269       2) there are no other routes that have been received from BGP
   2270       speakers in neighboring autonomous systems, or
   2271 
   2272       3) the newly received route is selected as a result of breaking a
   2273       tie between several routes which have the highest degree of
   2274       preference, and the same destination (the tie-breaking procedure
   2275       is specified in 9.2.1.1).
   2276 
   2277    When a BGP speaker receives an UPDATE message with a non-empty
   2278    WITHDRAWN ROUTES field, it shall remove from its Adj-RIB-In all
   2279    routes whose destinations were carried in this field (as IP prefixes).
   2280    The speaker shall take the following additional steps:
   2281 
   2282       1) if the corresponding feasible route had not been previously
   2283       advertised, then no further action is necessary
   2284 
   2285       2) if the corresponding feasible route had been previously
   2286       advertised, then:
   2287 
   2288          i) if a new route is selected for advertisement that has the
   2289          same Network Layer Reachability Information as the unfeasible
   2290          routes, then the local BGP speaker shall advertise the
   2291          replacement route
   2292 
   2293          ii) if a replacement route is not available for advertisement,
   2294          then the BGP speaker shall include the destinations of the
   2303          unfeasible route (in the form of IP prefixes) in the WITHDRAWN
   2304          ROUTES field of an UPDATE message, and shall send this message
   2305          to each peer to whom it had previously advertised the
   2306          corresponding feasible route.
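
           The withdrawal steps above reduce to a three-way decision. The
           sketch below assumes the speaker tracks which prefixes it has
           advertised in a set; the function name and the action strings
           are made up for illustration.

```python
def on_withdrawal(prefix, advertised, replacement=None):
    """Decide what to send after prefix is removed from an Adj-RIB-In.

    advertised: set of prefixes previously advertised to peers.
    replacement: a newly selected route with the same NLRI, or None.
    Returns an (action, payload) tuple.
    """
    if prefix not in advertised:
        return ("no_action", None)          # step 1: never advertised
    if replacement is not None:
        return ("advertise", replacement)   # step 2 i: send the replacement
    advertised.discard(prefix)              # no longer advertised to peers
    return ("withdraw", prefix)             # step 2 ii: WITHDRAWN ROUTES
```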
   2307 
   2308    All feasible routes which are advertised shall be placed in the
   2309    appropriate Adj-RIBs-Out, and all unfeasible routes which are
   2310    advertised shall be removed from the Adj-RIBs-Out.
   2311 
   2312 9.2.1.1 Breaking Ties (Internal Updates)
   2313 
   2314    If a local BGP speaker has connections to several BGP speakers in
   2315    neighboring autonomous systems, there will be multiple Adj-RIBs-In
   2316    associated with these peers. These Adj-RIBs-In might contain several
   2317    equally preferable routes to the same destination, all of which were
   2318    advertised by BGP speakers located in neighboring autonomous systems.
   2319    The local BGP speaker shall select one of these routes according to
   2320    the following rules:
   2321 
   2322       a) If the candidate routes differ only in their NEXT_HOP and
   2323       MULTI_EXIT_DISC attributes, and the local system is configured to
   2324       take into account the MULTI_EXIT_DISC attribute, select the route
   2325       that has the lowest value of the MULTI_EXIT_DISC attribute.
   2326 
   2327       b) If the local system can ascertain the cost of a path to the
   2328       entity depicted by the NEXT_HOP attribute of the candidate route,
   2329       select the route with the lowest cost.
   2330 
   2331       c) In all other cases, select the route that was advertised by the
   2332       BGP speaker whose BGP Identifier has the lowest value.
   2333 
   2334 9.2.2 External Updates
   2335 
   2336    The external update process is concerned with the distribution of
   2337    routing information to BGP speakers located in neighboring autonomous
   2338    systems. As part of the Phase 3 route selection process, the BGP
   2339    speaker has updated its Adj-RIBs-Out and its Forwarding Table. All
   2340    newly installed routes and all newly unfeasible routes for which there
   2341    is no replacement route shall be advertised to BGP speakers located in
   2342    neighboring autonomous systems by means of an UPDATE message.
   2343 
   2344    Any routes in the Loc-RIB marked as unfeasible shall be removed.
   2345    Changes to the reachable destinations within its own autonomous
   2346    system shall also be advertised in an UPDATE message.
   2347 
   2359 9.2.3 Controlling Routing Traffic Overhead
   2360 
   2361    The BGP protocol constrains the amount of routing traffic (that is,
   2362    UPDATE messages) in order to limit both the link bandwidth needed to
   2363    advertise UPDATE messages and the processing power needed by the
   2364    Decision Process to digest the information contained in the UPDATE
   2365    messages.
   2366 
   2367 9.2.3.1 Frequency of Route Advertisement
   2368 
   2369    The parameter MinRouteAdvertisementInterval determines the minimum
   2370    amount of time that must elapse between advertisement of routes to a
   2371    particular destination from a single BGP speaker. This rate limiting
   2372    procedure applies on a per-destination basis, although the value of
   2373    MinRouteAdvertisementInterval is set on a per BGP peer basis.
   2374 
   2375    Two UPDATE messages sent from a single BGP speaker that advertise
   2376    feasible routes to some common set of destinations received from BGP
   2377    speakers in neighboring autonomous systems must be separated by at
   2378    least MinRouteAdvertisementInterval. Clearly, this can only be
   2379    achieved precisely by keeping a separate timer for each common set of
   2380    destinations. This would be unwarranted overhead. Any technique which
   2381    ensures that the interval between two UPDATE messages sent from a
   2382    single BGP speaker that advertise feasible routes to some common set
   2383    of destinations received from BGP speakers in neighboring autonomous
   2384    systems will be at least MinRouteAdvertisementInterval, and will also
   2385    ensure a constant upper bound on the interval is acceptable.
   2386 
   2387    Since fast convergence is needed within an autonomous system, this
   2388    procedure does not apply to routes received from other BGP speakers
   2389    in the same autonomous system. To avoid long-lived black holes, the
   2390    procedure does not apply to the explicit withdrawal of unfeasible
   2391    routes (that is, routes whose destinations (expressed as IP prefixes)
   2392    are listed in the WITHDRAWN ROUTES field of an UPDATE message).
   2393 
   2394    This procedure does not limit the rate of route selection, but only
   2395    the rate of route advertisement. If new routes are selected multiple
   2396    times while awaiting the expiration of MinRouteAdvertisementInterval,
   2397    the last route selected shall be advertised at the end of
   2398    MinRouteAdvertisementInterval.
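
           One way to picture the rule that only the last selected route is
           advertised is a per-peer limiter that remembers a single pending
           route per destination. The class below is a hypothetical sketch.
           For clarity it keeps one timestamp per destination, which is the
           precise scheme the text describes as unwarranted overhead; a
           real implementation would likely use something coarser. Time
           values are plain numbers supplied by the caller.

```python
class AdvertisementLimiter:
    """Rate-limits advertisements towards one peer (illustrative only)."""

    def __init__(self, interval):
        self.interval = interval   # MinRouteAdvertisementInterval
        self.last_sent = {}        # prefix -> time of last advertisement
        self.pending = {}          # prefix -> latest route not yet sent

    def route_selected(self, prefix, route, now):
        """Return the route if it may be sent now, else hold it."""
        last = self.last_sent.get(prefix)
        if last is None or now - last >= self.interval:
            self.last_sent[prefix] = now
            self.pending.pop(prefix, None)
            return route
        self.pending[prefix] = route   # replaces any earlier held route
        return None

    def flush(self, now):
        """Return held routes whose interval has expired."""
        due = {}
        for prefix, route in list(self.pending.items()):
            if now - self.last_sent[prefix] >= self.interval:
                due[prefix] = route
                self.last_sent[prefix] = now
                del self.pending[prefix]
        return due
```

           Selection itself is never delayed here; only the sending is.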
   2399 
   2400 9.2.3.2 Frequency of Route Origination
   2401 
   2402    The parameter MinASOriginationInterval determines the minimum amount
   2403    of time that must elapse between successive advertisements of UPDATE
   2404    messages that report changes within the advertising BGP speaker's own
   2405    autonomous system.
   2406 
   2415 9.2.3.3 Jitter
   2416 
   2417    To minimize the likelihood that the distribution of BGP messages by a
   2418    given BGP speaker will contain peaks, jitter should be applied to the
   2419    timers associated with MinASOriginationInterval, Keepalive, and
   2420    MinRouteAdvertisementInterval. A given BGP speaker shall apply the
   2421    same jitter to each of these quantities regardless of the
   2422    destinations to which the updates are being sent; that is, jitter
   2423    will not be applied on a "per peer" basis.
   2424 
   2425    The amount of jitter to be introduced shall be determined by
   2426    multiplying the base value of the appropriate timer by a random
   2427    factor which is uniformly distributed in the range from 0.75 to 1.0.
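
           The jitter computation is a single multiplication. A minimal
           sketch, assuming the base timer value is given in seconds (the
           function name is illustrative):

```python
import random


def jittered(base_seconds: float) -> float:
    """Scale a base timer value by a uniform random factor in [0.75, 1.0]."""
    return base_seconds * random.uniform(0.75, 1.0)
```

           With a base value of 30 seconds the result always falls between
           22.5 and 30 seconds.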
   2428 
   2429 9.2.4 Efficient Organization of Routing Information
   2430 
   2431    Having selected the routing information which it will advertise, a
   2432    BGP speaker may avail itself of several methods to organize this
   2433    information in an efficient manner.
   2434 
   2435 9.2.4.1 Information Reduction
   2436 
   2437    Information reduction may imply a reduction in granularity of policy
   2438    control - after information is collapsed, the same policies will
   2439    apply to all destinations and paths in the equivalence class.
   2440 
   2441    The Decision Process may optionally reduce the amount of information
   2442    that it will place in the Adj-RIBs-Out by any of the following
   2443    methods:
   2444 
   2445       a)   Network Layer Reachability Information (NLRI):
   2446 
   2447       Destination IP addresses can be represented as IP address
   2448       prefixes.  In cases where there is a correspondence between the
   2449       address structure and the systems under control of an autonomous
   2450       system administrator, it will be possible to reduce the size of
   2451       the NLRI carried in the UPDATE messages.
   2452 
   2453       b)   AS_PATHs:
   2454 
   2455       AS path information can be represented as ordered AS_SEQUENCEs or
   2456       unordered AS_SETs. AS_SETs are used in the route aggregation
   2457       algorithm described in 9.2.4.2. They reduce the size of the
   2458       AS_PATH information by listing each AS number only once,
   2459       regardless of how many times it may have appeared in multiple
   2460       AS_PATHs that were aggregated.
   2461 
   2471       An AS_SET implies that the destinations listed in the NLRI can be
   2472       reached through paths that traverse at least some of the
   2473       constituent autonomous systems. AS_SETs provide sufficient
   2474       information to avoid routing information looping; however their
   2475       use may prune potentially feasible paths, since such paths are no
   2476       longer listed individually in the form of AS_SEQUENCEs.  In
   2477       practice this is not likely to be a problem, since once an IP
   2478       packet arrives at the edge of a group of autonomous systems, the
   2479       BGP speaker at that point is likely to have more detailed path
   2480       information and can distinguish individual paths to destinations.
   2481 
   2482 9.2.4.2 Aggregating Routing Information
   2483 
   2484    Aggregation is the process of combining the characteristics of
   2485    several different routes in such a way that a single route can be
   2486    advertised.  Aggregation can occur as part of the Decision Process
   2487    to reduce the amount of routing information that will be placed in
   2488    the Adj-RIBs-Out.
   2489 
   2490    Aggregation reduces the amount of information that a BGP speaker must
   2491    store and exchange with other BGP speakers. Routes can be aggregated
   2492    by applying the following procedure separately to path attributes of
   2493    like type and to the Network Layer Reachability Information.
   2494 
   2495    Routes that have the following attributes shall not be aggregated
   2496    unless the corresponding attributes of each route are identical:
   2497    MULTI_EXIT_DISC, NEXT_HOP.
   2498 
   2499    Path attributes that have different type codes cannot be aggregated
   2500    together. Path attributes of the same type code may be aggregated
   2501    according to the following rules:
   2502 
   2503       ORIGIN attribute: If at least one route among routes that are
   2504       aggregated has ORIGIN with the value INCOMPLETE, then the
   2505       aggregated route must have the ORIGIN attribute with the value
   2506       INCOMPLETE. Otherwise, if at least one route among routes that are
   2507       aggregated has ORIGIN with the value EGP, then the aggregated
   2508       route must have the ORIGIN attribute with the value EGP. In all
   2509       other cases the value of the ORIGIN attribute of the aggregated
   2510       route is IGP.
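
           The ORIGIN rule is a simple precedence order. A sketch, assuming
           ORIGIN values are represented by the strings "IGP", "EGP" and
           "INCOMPLETE" (the representation and the function name are
           illustrative):

```python
def aggregate_origin(origins):
    """Return the ORIGIN of the aggregate from its contributing routes."""
    if "INCOMPLETE" in origins:
        return "INCOMPLETE"
    if "EGP" in origins:
        return "EGP"
    # All other cases: every contributing route is interior to its
    # originating AS, which is the ORIGIN value named IGP.
    return "IGP"
```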
   2511 
   2512       AS_PATH attribute: If routes to be aggregated have identical
   2513       AS_PATH attributes, then the aggregated route has the same AS_PATH
   2514       attribute as each individual route.
   2515 
   2516       For the purpose of aggregating AS_PATH attributes we model each AS
   2517       within the AS_PATH attribute as a tuple <type, value>, where
   2518       "type" identifies a type of the path segment the AS belongs to
   2527       (e.g. AS_SEQUENCE, AS_SET), and "value" is the AS number.  If the
   2528       routes to be aggregated have different AS_PATH attributes, then
   2529       the aggregated AS_PATH attribute shall satisfy all of the
   2530       following conditions:
   2531 
   2532          - all tuples of the type AS_SEQUENCE in the aggregated AS_PATH
   2533          shall appear in all of the AS_PATHs in the initial set of routes
   2534          to be aggregated.
   2535 
   2536          - all tuples of the type AS_SET in the aggregated AS_PATH shall
   2537          appear in at least one of the AS_PATHs in the initial set (they
   2538          may appear as either AS_SET or AS_SEQUENCE types).
   2539 
   2540          - for any tuple X of the type AS_SEQUENCE in the aggregated
   2541          AS_PATH which precedes tuple Y in the aggregated AS_PATH, X
   2542          precedes Y in each AS_PATH in the initial set which contains Y,
   2543          regardless of the type of Y.
   2544 
   2545          - No tuple with the same value shall appear more than once in
   2546          the aggregated AS_PATH, regardless of the tuple's type.
   2547 
   2548       An implementation may choose any algorithm which conforms to these
   2549       rules.  At a minimum a conformant implementation shall be able to
   2550       perform the following algorithm that meets all of the above
   2551       conditions:
   2552 
   2553          - determine the longest leading sequence of tuples (as defined
   2554          above) common to all the AS_PATH attributes of the routes to be
   2555          aggregated. Make this sequence the leading sequence of the
   2556          aggregated AS_PATH attribute.
   2557 
   2558          - set the type of the rest of the tuples from the AS_PATH
   2559          attributes of the routes to be aggregated to AS_SET, and append
   2560          them to the aggregated AS_PATH attribute.
   2561 
   2562          - if the aggregated AS_PATH has more than one tuple with the
   2563          same value (regardless of tuple's type), eliminate all but one
   2564          such tuple by deleting tuples of the type AS_SET from the
   2565          aggregated AS_PATH attribute.
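
   The minimum algorithm above can be sketched in Python as follows.
   This is an illustrative sketch, not part of the specification: the
   name aggregate_as_paths and the string constants are hypothetical,
   and each AS_PATH is assumed to be a list of <type, value> tuples.

```python
AS_SEQUENCE = "AS_SEQUENCE"
AS_SET = "AS_SET"


def aggregate_as_paths(paths):
    """Aggregate AS_PATHs, each given as a list of (type, value) tuples."""
    if not paths:
        return []

    # Step 1: longest leading run of tuples common to every AS_PATH.
    prefix = []
    for column in zip(*paths):
        if any(tup != column[0] for tup in column):
            break
        prefix.append(column[0])

    # Step 2: retype the remaining tuples as AS_SET and append them.
    # Step 3: skip an AS_SET tuple whose AS number is already present,
    # so that only tuples of type AS_SET are ever deleted.
    aggregated = list(prefix)
    seen = {value for _type, value in prefix}
    for path in paths:
        for _type, value in path[len(prefix):]:
            if value not in seen:
                seen.add(value)
                aggregated.append((AS_SET, value))
    return aggregated


path_a = [(AS_SEQUENCE, 1), (AS_SEQUENCE, 2), (AS_SEQUENCE, 3)]
path_b = [(AS_SEQUENCE, 1), (AS_SEQUENCE, 2), (AS_SEQUENCE, 4),
          (AS_SEQUENCE, 3)]
print(aggregate_as_paths([path_a, path_b]))
```

   In this example ASs 1 and 2 form the common leading sequence, while
   ASs 3 and 4 end up as AS_SET tuples, each appearing once.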
   2566 
   2567       Appendix 6, section 6.8 presents another algorithm that satisfies
   2568       the conditions and allows for more complex policy configurations.
   2569 
   2570       ATOMIC_AGGREGATE: If at least one of the routes to be aggregated
   2571       has ATOMIC_AGGREGATE path attribute, then the aggregated route
   2572       shall have this attribute as well.
   2573 
   2583       AGGREGATOR: All AGGREGATOR attributes of all routes to be
   2584       aggregated should be ignored.
   2585 
   2586 9.3   Route Selection Criteria
   2587 
   2588    Generally speaking, additional rules for comparing routes among
   2589    several alternatives are outside the scope of this document.  There
   2590    are two exceptions:
   2591 
   2592       - If the local AS appears in the AS path of the new route being
   2593       considered, then that new route cannot be viewed as better than
   2594       any other route.  If such a route were ever used, a routing loop
   2595       would result.
   2596 
   2597       - In order to achieve successful distributed operation, only
   2598       routes with a likelihood of stability can be chosen.  Thus, an AS
   2599       must avoid using unstable routes, and it must not make rapid
   2600       spontaneous changes to its choice of route.  Quantifying the terms
   2601       "unstable" and "rapid" in the previous sentence will require
   2602       experience, but the principle is clear.
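
   The first exception can be sketched as a simple filter.  This is a
   hedged illustration only: the names contains_as and eligible_routes
   and the route layout (a dict with an "as_path" list of
   (segment type, AS numbers) pairs) are assumptions.  Dropping such
   routes outright is a simplification of "cannot be viewed as better
   than any other route".  The second exception is not modelled, since
   the text leaves "unstable" and "rapid" unquantified.

```python
def contains_as(as_path, local_as):
    """True if local_as appears in any segment of the AS path.

    as_path is a list of (segment_type, [as_numbers]) pairs.
    """
    return any(local_as in members for _segment_type, members in as_path)


def eligible_routes(routes, local_as):
    """Drop candidate routes whose AS path already contains local_as."""
    return [route for route in routes
            if not contains_as(route["as_path"], local_as)]


candidates = [
    {"prefix": "10.0.0.0/8", "as_path": [("AS_SEQUENCE", [65001, 65002])]},
    {"prefix": "10.0.0.0/8", "as_path": [("AS_SEQUENCE", [65003]),
                                         ("AS_SET", [65000, 65004])]},
]
print(eligible_routes(candidates, local_as=65000))
```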
   2603 
   2604 9.4   Originating BGP routes
   2605 
   2606    A BGP speaker may originate BGP routes by injecting routing
   2607    information acquired by some other means (e.g. via an IGP) into BGP.
   2608    A BGP speaker that originates BGP routes shall assign the degree of
   2609    preference to these routes by passing them through the Decision
   2610    Process (see Section 9.1).  These routes may also be distributed to
   2611    other BGP speakers within the local AS as part of the Internal update
   2612    process (see Section 9.2.1). The decision whether to distribute non-
   2613    BGP acquired routes within an AS via BGP or not depends on the
   2614    environment within the AS (e.g. type of IGP) and should be controlled
   2615    via configuration.
   2616 
   2639 Appendix 1.  BGP FSM State Transitions and Actions.
   2640 
   2641    This Appendix discusses the transitions between states in the BGP FSM
   2642    in response to BGP events.  The following is the list of these states
   2643    and events when the negotiated Hold Time value is non-zero.
   2644 
   2645        BGP States:
   2646 
   2647                 1 - Idle
   2648                 2 - Connect
   2649                 3 - Active
   2650                 4 - OpenSent
   2651                 5 - OpenConfirm
   2652                 6 - Established
   2653 
   2654        BGP Events:
   2655 
   2656                 1 - BGP Start
   2657                 2 - BGP Stop
   2658                 3 - BGP Transport connection open
   2659                 4 - BGP Transport connection closed
   2660                 5 - BGP Transport connection open failed
   2661                 6 - BGP Transport fatal error
   2662                 7 - ConnectRetry timer expired
   2663                 8 - Hold Timer expired
   2664                 9 - KeepAlive timer expired
   2665                10 - Receive OPEN message
   2666                11 - Receive KEEPALIVE message
   2667                12 - Receive UPDATE message
   2668                13 - Receive NOTIFICATION message
   2669 
   2695    The following table describes the state transitions of the BGP FSM
   2696    and the actions triggered by these transitions.
   2697 
   2698 
   2699     Event                Actions               Message Sent   Next State
   2700     --------------------------------------------------------------------
   2701     Idle (1)
   2702      1            Initialize resources            none             2
   2703                   Start ConnectRetry timer
   2704                   Initiate a transport connection
   2705      others               none                    none             1
   2706 
   2707     Connect (2)
   2708      1                    none                    none             2
   2709      3            Complete initialization         OPEN             4
   2710                   Clear ConnectRetry timer
   2711      5            Restart ConnectRetry timer      none             3
   2712      7            Restart ConnectRetry timer      none             2
   2713                   Initiate a transport connection
   2714      others       Release resources               none             1
   2715 
   2716     Active (3)
   2717      1                    none                    none             3
   2718      3            Complete initialization         OPEN             4
   2719                   Clear ConnectRetry timer
   2720      5            Close connection                none             3
   2721                   Restart ConnectRetry timer
   2722      7            Restart ConnectRetry timer      none             2
   2723                   Initiate a transport connection
   2724      others       Release resources               none             1
   2725 
   2726     OpenSent (4)
   2727      1                    none                    none             4
   2728      4            Close transport connection      none             3
   2729                   Restart ConnectRetry timer
   2730      6            Release resources               none             1
   2731     10            Process OPEN is OK            KEEPALIVE          5
   2732                   Process OPEN failed           NOTIFICATION       1
   2733     others        Close transport connection    NOTIFICATION       1
   2734                   Release resources
   2735 
   2751     OpenConfirm (5)
   2752      1                   none                     none             5
   2753      4            Release resources               none             1
   2754      6            Release resources               none             1
   2755      9            Restart KeepAlive timer       KEEPALIVE          5
   2756     11            Complete initialization         none             6
   2757                   Restart Hold Timer
   2758     13            Close transport connection      none             1
   2759                   Release resources
   2760     others        Close transport connection    NOTIFICATION       1
   2761                   Release resources
   2762 
   2763     Established (6)
   2764      1                   none                     none             6
   2765      4            Release resources               none             1
   2766      6            Release resources               none             1
   2767      9            Restart KeepAlive timer       KEEPALIVE          6
   2768     11            Restart Hold Timer            KEEPALIVE          6
   2769     12            Process UPDATE is OK          UPDATE             6
   2770                   Process UPDATE failed         NOTIFICATION       1
   2771     13            Close transport connection      none             1
   2772                   Release resources
   2773     others        Close transport connection    NOTIFICATION       1
   2774                   Release resources
   2775     --------------------------------------------------------------------
   2776 
   2807       The following is a condensed version of the above state transition
   2808       table.
   2809 
   2810 
   2811    Events| Idle | Connect | Active | OpenSent | OpenConfirm | Estab
   2812          | (1)  |   (2)   |  (3)   |    (4)   |     (5)     |   (6)
   2813          |--------------------------------------------------------------
   2814     1    |  2   |    2    |   3    |     4    |      5      |    6
   2815          |      |         |        |          |             |
   2816     2    |  1   |    1    |   1    |     1    |      1      |    1
   2817          |      |         |        |          |             |
   2818     3    |  1   |    4    |   4    |     1    |      1      |    1
   2819          |      |         |        |          |             |
   2820     4    |  1   |    1    |   1    |     3    |      1      |    1
   2821          |      |         |        |          |             |
   2822     5    |  1   |    3    |   3    |     1    |      1      |    1
   2823          |      |         |        |          |             |
   2824     6    |  1   |    1    |   1    |     1    |      1      |    1
   2825          |      |         |        |          |             |
   2826     7    |  1   |    2    |   2    |     1    |      1      |    1
   2827          |      |         |        |          |             |
   2828     8    |  1   |    1    |   1    |     1    |      1      |    1
   2829          |      |         |        |          |             |
   2830     9    |  1   |    1    |   1    |     1    |      5      |    6
   2831          |      |         |        |          |             |
   2832    10    |  1   |    1    |   1    |  1 or 5  |      1      |    1
   2833          |      |         |        |          |             |
   2834    11    |  1   |    1    |   1    |     1    |      6      |    6
   2835          |      |         |        |          |             |
   2836    12    |  1   |    1    |   1    |     1    |      1      | 1 or 6
   2837          |      |         |        |          |             |
   2838    13    |  1   |    1    |   1    |     1    |      1      |    1
   2839          |      |         |        |          |             |
   2840          ---------------------------------------------------------------
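
   The condensed table can be encoded as a lookup keyed by (state,
   event).  The sketch below is one possible encoding, not a normative
   one: the enum member names and the next_state function are
   hypothetical, and the "ok" flag stands for the outcome of processing
   an OPEN or UPDATE message (the "1 or 5" and "1 or 6" cells).  Like
   the table, it applies when the negotiated Hold Time is non-zero.

```python
from enum import IntEnum


class State(IntEnum):
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPEN_SENT = 4
    OPEN_CONFIRM = 5
    ESTABLISHED = 6


class Event(IntEnum):
    START = 1
    STOP = 2
    TRANSPORT_OPEN = 3
    TRANSPORT_CLOSED = 4
    TRANSPORT_OPEN_FAILED = 5
    TRANSPORT_FATAL_ERROR = 6
    CONNECT_RETRY_EXPIRED = 7
    HOLD_TIMER_EXPIRED = 8
    KEEPALIVE_TIMER_EXPIRED = 9
    RECV_OPEN = 10
    RECV_KEEPALIVE = 11
    RECV_UPDATE = 12
    RECV_NOTIFICATION = 13


S, E = State, Event

# Only cells that do not lead to Idle are listed; all others default to Idle.
_TRANSITIONS = {
    (S.IDLE, E.START): S.CONNECT,
    (S.CONNECT, E.START): S.CONNECT,
    (S.ACTIVE, E.START): S.ACTIVE,
    (S.OPEN_SENT, E.START): S.OPEN_SENT,
    (S.OPEN_CONFIRM, E.START): S.OPEN_CONFIRM,
    (S.ESTABLISHED, E.START): S.ESTABLISHED,
    (S.CONNECT, E.TRANSPORT_OPEN): S.OPEN_SENT,
    (S.ACTIVE, E.TRANSPORT_OPEN): S.OPEN_SENT,
    (S.OPEN_SENT, E.TRANSPORT_CLOSED): S.ACTIVE,
    (S.CONNECT, E.TRANSPORT_OPEN_FAILED): S.ACTIVE,
    (S.ACTIVE, E.TRANSPORT_OPEN_FAILED): S.ACTIVE,
    (S.CONNECT, E.CONNECT_RETRY_EXPIRED): S.CONNECT,
    (S.ACTIVE, E.CONNECT_RETRY_EXPIRED): S.CONNECT,
    (S.OPEN_CONFIRM, E.KEEPALIVE_TIMER_EXPIRED): S.OPEN_CONFIRM,
    (S.ESTABLISHED, E.KEEPALIVE_TIMER_EXPIRED): S.ESTABLISHED,
    (S.OPEN_CONFIRM, E.RECV_KEEPALIVE): S.ESTABLISHED,
    (S.ESTABLISHED, E.RECV_KEEPALIVE): S.ESTABLISHED,
}

# Cells whose outcome depends on whether message processing succeeded.
_CONDITIONAL = {
    (S.OPEN_SENT, E.RECV_OPEN): S.OPEN_CONFIRM,
    (S.ESTABLISHED, E.RECV_UPDATE): S.ESTABLISHED,
}


def next_state(state, event, ok=True):
    """Return the next FSM state for an event received in a given state."""
    key = (State(state), Event(event))
    if key in _CONDITIONAL:
        return _CONDITIONAL[key] if ok else State.IDLE
    return _TRANSITIONS.get(key, State.IDLE)


print(next_state(State.CONNECT, Event.TRANSPORT_OPEN).name)
```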
   2841 
   2842 
   2843 Appendix 2.  Comparison with RFC 1267
   2844 
   2845    BGP-4 is capable of operating in an environment where a set of
   2846    reachable destinations may be expressed via a single IP prefix.  The
   2847    concepts of network classes and subnetting are foreign to BGP-4.  To
   2848    accommodate these capabilities, BGP-4 changes the semantics and
   2849    encoding associated with the AS_PATH attribute.  New text has been
   2850    added to define the semantics of IP prefixes.  These abilities allow
   2851    BGP-4 to support the proposed supernetting scheme [9].
   2852 
   2853    To simplify configuration this version introduces a new attribute,
   2854    LOCAL_PREF, that facilitates route selection procedures.
   2855 
   2863    The INTER_AS_METRIC attribute has been renamed to MULTI_EXIT_DISC.
   2864    A new attribute, ATOMIC_AGGREGATE, has been introduced to ensure that
   2865    certain aggregates are not de-aggregated.  Another new attribute,
   2866    AGGREGATOR, can be added to aggregate routes in order to advertise
   2867    which AS and which BGP speaker within that AS caused the aggregation.
   2868 
   2869    To ensure that Hold Timers are symmetric, the Hold Time is now
   2870    negotiated on a per-connection basis.  Hold Times of zero are now
   2871    supported.
   2872 
   2873 Appendix 3.  Comparison with RFC 1163
   2874 
   2875    All of the changes listed in Appendix 2, plus the following.
   2876 
   2877    To detect and recover from BGP connection collision, a new field (BGP
   2878    Identifier) has been added to the OPEN message. New text (Section
   2879    6.8) has been added to specify the procedure for detecting and
   2880    recovering from collision.
   2881 
   2882    The new document no longer restricts the border router that is passed
   2883    in the NEXT_HOP path attribute to be part of the same Autonomous
   2884    System as the BGP Speaker.
   2885 
   2886    The new document optimizes and simplifies the exchange of information
   2887    about previously reachable routes.
   2888 
   2889 Appendix 4.  Comparison with RFC 1105
   2890 
   2891    All of the changes listed in Appendices 2 and 3, plus the following.
   2892 
   2893    Minor changes to the RFC 1105 Finite State Machine were necessary
   2894    to accommodate the TCP user interface provided by 4.3 BSD.
   2895 
   2896    The notion of Up/Down/Horizontal relations present in RFC 1105 has
   2897    been removed from the protocol.
   2898 
   2899    The changes in the message format from RFC 1105 are as follows:
   2900 
   2901       1.  The Hold Time field has been removed from the BGP header and
   2902       added to the OPEN message.
   2903 
   2904       2.  The version field has been removed from the BGP header and
   2905       added to the OPEN message.
   2906 
   2907       3.  The Link Type field has been removed from the OPEN message.
   2908 
   2909       4.  The OPEN CONFIRM message has been eliminated and replaced with
   2910       implicit confirmation provided by the KEEPALIVE message.
   2911 
   2919       5.  The format of the UPDATE message has been changed
   2920       significantly.  New fields were added to the UPDATE message to
   2921       support multiple path attributes.
   2922 
   2923       6.  The Marker field has been expanded and its role broadened to
   2924       support authentication.
   2925 
   2926       Note that quite often BGP, as specified in RFC 1105, is referred
   2927       to as BGP-1; BGP, as specified in RFC 1163, is referred to as
   2928       BGP-2; BGP, as specified in RFC 1267, is referred to as BGP-3; and
   2929       BGP, as specified in this document, is referred to as BGP-4.
   2930 
   2931 Appendix 5.  TCP options that may be used with BGP
   2932 
   2933    If a local system TCP user interface supports the TCP PUSH function,
   2934    then each BGP message should be transmitted with the PUSH flag set.
   2935    Setting the PUSH flag forces BGP messages to be transmitted promptly
   2936    to the receiver.
   2937 
   2938    If a local system TCP user interface supports setting precedence for
   2939    a TCP connection, then the BGP transport connection should be opened
   2940    with precedence set to Internetwork Control (110) value (see also
   2941    [6]).
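
   Where the sockets interface exposes the IP Type of Service octet,
   the precedence recommendation might be applied as sketched below.
   The function names are hypothetical, and the sketch assumes a
   platform whose Python socket module defines IP_TOS; the precedence
   value occupies the three high-order bits of that octet (see [6]).
   The PUSH recommendation is not covered here.

```python
import socket

INTERNETWORK_CONTROL = 0b110  # precedence value named in the text


def precedence_to_tos(precedence):
    """Place a 3-bit IP precedence in the high-order bits of the TOS octet."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence must fit in 3 bits")
    return precedence << 5


def open_bgp_socket():
    """Create a TCP socket marked with Internetwork Control precedence.

    Assumes socket.IP_TOS is available; the caller connects the socket.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                    precedence_to_tos(INTERNETWORK_CONTROL))
    return sock


print(hex(precedence_to_tos(INTERNETWORK_CONTROL)))
```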
   2942 
   2943 Appendix 6.  Implementation Recommendations
   2944 
   2945    This section presents some implementation recommendations.
   2946 
   2947 6.1 Multiple Networks Per Message
   2948 
   2949    The BGP protocol allows for multiple address prefixes with the same
   2950    AS path and next-hop gateway to be specified in one message. Making
   2951    use of this capability is highly recommended. With one address prefix
   2952    per message there is a substantial increase in overhead in the
   2953    receiver. Not only does the system overhead increase due to the
   2954    reception of multiple messages, but the overhead of scanning the
   2955    routing table for updates to BGP peers and other routing protocols
   2956    (and sending the associated messages) is incurred multiple times as
   2957    well. One method of building messages containing many address
   2958    prefixes per AS path and gateway from a routing table that is not
   2959    organized per AS path is to build many messages as the routing table
   2960    is scanned. As each address prefix is processed, a message for the
   2961    associated AS path and gateway is allocated, if it does not exist,
   2962    and the new address prefix is added to it.  If such a message exists,
   2963    the new address prefix is just appended to it. If the message lacks
   2964    the space to hold the new address prefix, it is transmitted, a new
   2965    message is allocated, and the new address prefix is inserted into the
   2966    new message. When the entire routing table has been scanned, all
   2975    allocated messages are sent and their resources released.  Maximum
   2976    compression is achieved when all the destinations covered by the
   2977    address prefixes share a gateway and common path attributes, making
   2978    it possible to send many address prefixes in one 4096-byte message.
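
   The message-building method described above might look like the
   following sketch.  It is illustrative only: pack_updates and
   nlri_size are hypothetical names, the path attributes are assumed to
   be available as an already-encoded byte string that serves as the
   grouping key, and messages are returned as lists rather than being
   written to a connection.

```python
MAX_MESSAGE = 4096  # largest BGP message, in bytes
HEADER = 19         # marker (16) + length (2) + type (1)
UPDATE_FIXED = 4    # unfeasible routes length (2) + path attribute length (2)


def nlri_size(prefix_len):
    """Bytes taken by one prefix: a length octet plus the prefix octets."""
    return 1 + (prefix_len + 7) // 8


def pack_updates(routes):
    """Group routes into UPDATE-sized batches keyed by path attributes.

    routes yields (attrs, prefix, prefix_len); attrs is the encoded path
    attribute block (AS path, next hop, ...) as bytes.
    Returns [(attrs, [(prefix, prefix_len), ...]), ...] in transmit order.
    """
    pending = {}  # attrs -> (prefixes, bytes used so far)
    sent = []
    for attrs, prefix, prefix_len in routes:
        empty = HEADER + UPDATE_FIXED + len(attrs)
        need = nlri_size(prefix_len)
        if empty + need > MAX_MESSAGE:
            raise ValueError("path attributes leave no room for a prefix")
        prefixes, used = pending.get(attrs, ([], empty))
        if used + need > MAX_MESSAGE:
            sent.append((attrs, prefixes))  # full: transmit, start afresh
            prefixes, used = [], empty
        prefixes.append((prefix, prefix_len))
        pending[attrs] = (prefixes, used + need)
    # Routing table fully scanned: send everything still allocated.
    sent.extend((attrs, prefixes)
                for attrs, (prefixes, _used) in pending.items())
    return sent


table = [(b"A", "10.0.0.0", 8), (b"B", "192.168.0.0", 16),
         (b"A", "172.16.0.0", 12)]
print(pack_updates(table))
```

   Here the two prefixes that share attribute block b"A" travel in one
   message even though they are not adjacent in the scanned table.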
   2979 
   2980    When peering with a BGP implementation that does not compress
   2981    multiple address prefixes into one message, it may be necessary to
   2982    take steps to reduce the overhead from the flood of data received
   2983    when a peer is acquired or a significant network topology change
   2984    occurs. One method of doing this is to limit the rate of updates.
   2985    This will eliminate the redundant scanning of the routing table to
   2986    provide flash updates for BGP peers and other routing protocols. A
   2987    disadvantage of this approach is that it increases the propagation
   2988    latency of routing information.  By choosing a minimum flash update
   2989    interval that is not much greater than the time it takes to process
   2990    the multiple messages, this latency should be minimized. A better
   2991    method would be to read all received messages before sending updates.
   2992 
   2993 6.2  Processing Messages on a Stream Protocol
   2994 
   2995    BGP uses TCP as a transport mechanism.  Due to the stream nature of
   2996    TCP, all the data for received messages does not necessarily arrive
   2997    at the same time. This can make it difficult to process the data as
   2998    messages, especially on systems such as BSD Unix where it is not
   2999    possible to determine how much data has been received but not yet
   3000    processed.
   3001 
   3002    One method that can be used in this situation is to first try to read
   3003    just the message header. For the KEEPALIVE message type, this is a
   3004    complete message; for other message types, the header should first be
   3005    verified, in particular the total length. If all checks are
   3006    successful, the specified length, minus the size of the message
   3007    header, is the amount of data left to read. An implementation that
   3008    would "hang" the routing information process while trying to read
   3009    from a peer could set up a message buffer (4096 bytes) per peer and
   3010    fill it with data as available until a complete message has been
   3011    received.
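
   A per-peer buffer of this kind can be sketched as follows.  The
   class and helper names are hypothetical.  The sketch assumes the
   19-byte header layout (16-byte marker, 2-byte length, 1-byte type)
   and checks only the length field; a real implementation would also
   verify the marker and the message type.

```python
import struct

HEADER_LEN = 19   # marker (16) + length (2) + type (1)
MAX_LEN = 4096    # largest BGP message, in bytes
KEEPALIVE = 4     # KEEPALIVE message type code


def make_message(msg_type, body=b""):
    """Build a message with an all-ones marker (helper for the example)."""
    length = HEADER_LEN + len(body)
    return b"\xff" * 16 + struct.pack("!HB", length, msg_type) + body


class MessageBuffer:
    """Accumulates TCP bytes from one peer and extracts whole messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Add received bytes; return [(type, body), ...] now complete."""
        self._buf += data
        messages = []
        while len(self._buf) >= HEADER_LEN:
            # Read the header first and verify the total length.
            length, msg_type = struct.unpack_from("!HB", self._buf, 16)
            if not HEADER_LEN <= length <= MAX_LEN:
                raise ValueError("bad message length %d" % length)
            if len(self._buf) < length:
                break  # rest of the message has not arrived yet
            messages.append((msg_type, bytes(self._buf[HEADER_LEN:length])))
            del self._buf[:length]
        return messages


peer = MessageBuffer()
wire = make_message(KEEPALIVE)
print(peer.feed(wire[:10]), peer.feed(wire[10:]))
```

   Because feed() never blocks, the routing process can keep servicing
   other peers while a partial message waits in the buffer.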
   3012 
   3013 6.3 Reducing route flapping
   3014 
   3015    To avoid excessive route flapping a BGP speaker which needs to
   3016    withdraw a destination and send an update about a more specific or
   3017    less specific route shall combine them into the same UPDATE message.
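
   As an illustration, a withdrawal and a replacement route can be
   carried in a single UPDATE as sketched below.  The function names
   are hypothetical, the marker is assumed to be all ones (no
   authentication), and the path attribute block is a placeholder
   holding only an ORIGIN attribute; a real announcement also needs
   AS_PATH and NEXT_HOP.

```python
import ipaddress
import struct

UPDATE = 2  # UPDATE message type code


def encode_prefix(cidr):
    """Encode a prefix as a length octet (in bits) plus the minimum octets."""
    net = ipaddress.ip_network(cidr)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def build_update(withdrawn, path_attributes, announced):
    """Build one UPDATE carrying both withdrawn and announced prefixes."""
    w = b"".join(encode_prefix(p) for p in withdrawn)
    n = b"".join(encode_prefix(p) for p in announced)
    body = (struct.pack("!H", len(w)) + w +
            struct.pack("!H", len(path_attributes)) + path_attributes + n)
    header = b"\xff" * 16 + struct.pack("!HB", 19 + len(body), UPDATE)
    return header + body


# Placeholder attribute block: ORIGIN (type 1) with value IGP (0).
origin_igp = b"\x40\x01\x01\x00"

# Withdraw the more specific route and announce the less specific one.
msg = build_update(["10.1.0.0/16"], origin_igp, ["10.0.0.0/8"])
print(len(msg), msg[16:].hex())
```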
   3018 
   3031 6.4 BGP Timers
   3032 
   3033    BGP employs five timers: ConnectRetry, Hold Time, KeepAlive,
   3034    MinASOriginationInterval, and MinRouteAdvertisementInterval.  The
   3035    suggested value for the ConnectRetry timer is 120 seconds.  The
   3036    suggested value for the Hold Time is 90 seconds.  The suggested value
   3037    for the KeepAlive timer is 30 seconds.  The suggested value for the
   3038    MinASOriginationInterval is 15 seconds.  The suggested value for the
   3039    MinRouteAdvertisementInterval is 30 seconds.
   3040 
   3041    An implementation of BGP MUST allow these timers to be configurable.
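
   A configurable timer set with the suggested defaults could be
   represented as in the sketch below.  The class and field names are
   hypothetical.  The check that the Hold Time is either zero or at
   least three seconds reflects the rule stated for the OPEN message
   elsewhere in this document, not anything in this section.

```python
from dataclasses import dataclass, fields


@dataclass
class BGPTimers:
    """Timer settings in seconds, defaulting to the suggested values."""
    connect_retry: int = 120
    hold_time: int = 90
    keepalive: int = 30
    min_as_origination_interval: int = 15
    min_route_advertisement_interval: int = 30

    def __post_init__(self):
        for field in fields(self):
            if getattr(self, field.name) < 0:
                raise ValueError(field.name + " must not be negative")
        # Hold Time rule taken from the OPEN message description.
        if 0 < self.hold_time < 3:
            raise ValueError("Hold Time must be zero or at least 3 seconds")


defaults = BGPTimers()
tuned = BGPTimers(hold_time=180, keepalive=60)
print(defaults, tuned)
```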
   3042 
   3043 6.5 Path attribute ordering
   3044 
   3045    Implementations which combine update messages as described above in
   3046    6.1 may prefer to see all path attributes presented in a known order.
   3047    This permits them to quickly identify sets of attributes from
   3048    different update messages which are semantically identical.  To
   3049    facilitate this, it is a useful optimization to order the path
   3050    attributes according to type code.  This optimization is entirely
   3051    optional.
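
   A minimal sketch of this optimization follows.  The function name
   and the (type code, value) representation of a path attribute are
   assumptions made for illustration.

```python
ORIGIN, AS_PATH, NEXT_HOP = 1, 2, 3  # attribute type codes


def canonical_attributes(attributes):
    """Return path attributes as a tuple ordered by type code.

    attributes is an iterable of (type_code, value_bytes) pairs.
    """
    return tuple(sorted(attributes, key=lambda attr: attr[0]))


first = [(NEXT_HOP, b"\x0a\x00\x00\x01"), (ORIGIN, b"\x00"),
         (AS_PATH, b"\x02\x01\xfd\xe9")]
second = [(ORIGIN, b"\x00"), (AS_PATH, b"\x02\x01\xfd\xe9"),
          (NEXT_HOP, b"\x0a\x00\x00\x01")]

# Once ordered, identical attribute sets compare equal and can be used
# as a dictionary key when grouping prefixes.
print(canonical_attributes(first) == canonical_attributes(second))
```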
   3052 
   3053 6.6 AS_SET sorting
   3054 
   3055    Another useful optimization that can be done to simplify this
   3056    situation is to sort the AS numbers found in an AS_SET.  This
   3057    optimization is entirely optional.
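
   For instance, with an AS_SET held as a sequence of AS numbers (an
   assumed representation; the function name is hypothetical), sorting
   gives every ordering of the same set one canonical form:

```python
def normalize_as_set(as_numbers):
    """Return the AS numbers of an AS_SET in ascending order."""
    return tuple(sorted(as_numbers))


print(normalize_as_set([300, 100, 200]) == normalize_as_set([200, 300, 100]))
```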
   3058 
   3059 6.7 Control over version negotiation
   3060 
   3061    Since BGP-4 is capable of carrying aggregated routes which cannot be
   3062    properly represented in BGP-3, an implementation which supports BGP-4
   3063    and another BGP version should provide the capability to only speak
   3064    BGP-4 on a per-peer basis.
   3065 
   3066 6.8 Complex AS_PATH aggregation
   3067 
   3068    An implementation which chooses to provide a path aggregation
   3069    algorithm which retains significant amounts of path information may
   3070    wish to use the following procedure:
   3071 
   3072       For the purpose of aggregating AS_PATH attributes of two routes,
   3073       we model each AS as a tuple <type, value>, where "type" identifies
   3074       a type of the path segment the AS belongs to (e.g.  AS_SEQUENCE,
   3075       AS_SET), and "value" is the AS number.  Two ASs are said to be the
   3076       same if their corresponding <type, value> tuples are the same.
   3077 
   3087       The algorithm to aggregate two AS_PATH attributes works as
   3088       follows:
   3089 
   3090          a) Identify the same ASs (as defined above) within each AS_PATH
   3091          attribute that are in the same relative order within both
   3092          AS_PATH attributes.  Two ASs, X and Y, are said to be in the
   3093          same order if either:
   3094 
   3095             - X precedes Y in both AS_PATH attributes, or
   3096             - Y precedes X in both AS_PATH attributes.
   3097 
   3098          b) The aggregated AS_PATH attribute consists of ASs identified
   3099          in (a) in exactly the same order as they appear in the AS_PATH
   3100          attributes to be aggregated. If two consecutive ASs identified
   3101          in (a) do not immediately follow each other in both of the
   3102          AS_PATH attributes to be aggregated, then the intervening ASs
   3103          (ASs that are between the two consecutive ASs that are the
   3104          same) in both attributes are combined into an AS_SET path
   3105          segment that consists of the intervening ASs from both AS_PATH
   3106          attributes; this segment is then placed in between the two
   3107          consecutive ASs identified in (a) of the aggregated attribute.
   3108          If two consecutive ASs identified in (a) immediately follow
   3109          each other in one attribute, but do not follow in another, then
   3110          the intervening ASs of the latter are combined into an AS_SET
   3111          path segment; this segment is then placed in between the two
   3112          consecutive ASs identified in (a) of the aggregated attribute.
   3113 
   3114       If as a result of the above procedure a given AS number appears
   3115       more than once within the aggregated AS_PATH attribute, all but
   3116       the last instance (rightmost occurrence) of that AS number should
   3117       be removed from the aggregated AS_PATH attribute.
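
   One way to realize this procedure is sketched below.  Several
   choices here are assumptions rather than requirements of the text:
   step (a) is implemented as a longest common subsequence of
   <type, value> tuples, ASs before the first or after the last common
   AS are treated like intervening ASs, and the result is returned as
   a flat list of tuples.  All names are hypothetical.

```python
AS_SEQUENCE = "AS_SEQUENCE"
AS_SET = "AS_SET"


def _common_in_order(a, b):
    """Index pairs (i, j) of a longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # lcs[i][j] is the length of the common subsequence of a[i:] and b[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def aggregate_two(a, b):
    """Aggregate two AS_PATHs given as lists of (type, value) tuples."""
    a, b = list(a), list(b)
    # Step (a): same ASs in the same relative order; a sentinel pair
    # closes the trailing gap.
    pairs = _common_in_order(a, b) + [(len(a), len(b))]
    out, prev_i, prev_j = [], 0, 0
    for i, j in pairs:
        # Step (b): intervening ASs from either path become AS_SET tuples.
        for _type, value in a[prev_i:i] + b[prev_j:j]:
            out.append((AS_SET, value))
        if i < len(a):
            out.append(a[i])
        prev_i, prev_j = i + 1, j + 1
    # Keep only the last (rightmost) instance of each AS number.
    seen, result = set(), []
    for tup in reversed(out):
        if tup[1] not in seen:
            seen.add(tup[1])
            result.append(tup)
    result.reverse()
    return result


left = [(AS_SEQUENCE, 1), (AS_SEQUENCE, 2), (AS_SEQUENCE, 3), (AS_SEQUENCE, 5)]
right = [(AS_SEQUENCE, 1), (AS_SEQUENCE, 4), (AS_SEQUENCE, 3), (AS_SEQUENCE, 5)]
print(aggregate_two(left, right))
```

   In this example ASs 1, 3 and 5 are common and keep their order,
   while the intervening ASs 2 and 4 are placed between 1 and 3 as
   AS_SET tuples.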
   3118 
   3119 References
   3120 
   3121    [1] Mills, D., "Exterior Gateway Protocol Formal Specification", RFC
   3122        904, BBN, April 1984.
   3123 
   3124    [2] Rekhter, Y., "EGP and Policy Based Routing in the New NSFNET
   3125        Backbone", RFC 1092, T.J. Watson Research Center, February 1989.
   3126 
   3127    [3] Braun, H-W., "The NSFNET Routing Architecture", RFC 1093,
   3128        MERIT/NSFNET Project, February 1989.
   3129 
   3130    [4] Postel, J., "Transmission Control Protocol - DARPA Internet
   3131        Program Protocol Specification", STD 7, RFC 793, DARPA, September
   3132        1981.
   3133 
   3143    [5] Rekhter, Y., and P. Gross, "Application of the Border Gateway
   3144        Protocol in the Internet", RFC 1772, T.J. Watson Research Center,
   3145        IBM Corp., MCI, March 1995.
   3146 
   3147    [6] Postel, J., "Internet Protocol - DARPA Internet Program Protocol
   3148        Specification", STD 5, RFC 791, DARPA, September 1981.
   3149 
   3150    [7] "Information Processing Systems - Telecommunications and
   3151        Information Exchange between Systems - Protocol for Exchange of
   3152        Inter-domain Routeing Information among Intermediate Systems to
   3153        Support Forwarding of ISO 8473 PDUs", ISO/IEC IS10747, 1993.
   3154 
   3155    [8] Fuller, V., Li, T., Yu, J., and K. Varadhan, "Classless Inter-
   3156        Domain Routing (CIDR): an Address Assignment and Aggregation
   3157        Strategy", RFC 1519, BARRNet, cisco, MERIT, OARnet, September
   3158        1993.
   3159 
   3160    [9] Rekhter, Y., and T. Li, "An Architecture for IP Address Allocation
   3161        with CIDR", RFC 1518, T.J. Watson Research Center, cisco,
   3162        September 1993.
   3163 
   3164 Security Considerations
   3165 
   3166    Security issues are not discussed in this document.
   3167 
   3168 Editors' Addresses
   3169 
   3170    Yakov Rekhter
   3171    T.J. Watson Research Center IBM Corporation
   3172    P.O. Box 704, Office H3-D40
   3173    Yorktown Heights, NY 10598
   3174 
   3175    Phone:  +1 914 784 7361
   3176    EMail:  yakov@watson.ibm.com
   3177 
   3178 
   3179    Tony Li
   3180    cisco Systems, Inc.
   3181    170 W. Tasman Dr.
   3182    San Jose, CA 95134
   3183 
   3184    EMail: tli@cisco.com
   3185 