Discussion:
[j-nsp] Network automation vs. manual config
Antti Ristimäki
2018-08-17 10:54:27 UTC
Hi colleagues,

This is something that I've been thinking about quite a lot, so I would be delighted to hear some comments, experiences or recommendations.

So, now that more and more of us are automating our networks, the question arises of how to manage the configurations if they are partially automated and partially manually maintained. This will be the case especially while transitioning from a pure CLI-jockey network towards a more automated one. There are probably multiple approaches to solving this, but below are a few of them:

One option is to generate the whole config automatically, e.g. from a template or a database, and just _not_accepting_ any manual configurations at all. Then, when there is a need to do something custom not yet supported by the automation tools, instead of manually configuring it one would take some additional time and build the support into the automation tools. The cost of this might be that deploying something new/custom/tailor-made takes a bit more time compared to just configuring it manually, but in the long run the benefits are obvious. I personally prefer this approach.

Generating the _whole_ configuration automatically off-line from scratch also makes it easy to remove elements from the configuration, as the auto-generated config can completely replace the existing running config.
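To make this concrete, here is a minimal sketch (not anyone's production tooling; the template, host and variable names are assumptions) of rendering a complete config with Jinja2 and pushing it with Juniper's PyEZ library, where overwrite=True gives the "load override, then commit" behaviour described above:

# Minimal sketch: render the *entire* device config from a template and
# replace the running config with it ("load override" + commit).
# Template/file/host names here are hypothetical.
from jinja2 import Environment, FileSystemLoader
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

env = Environment(loader=FileSystemLoader("templates"))
template = env.get_template("full-router.j2")     # renders a complete Junos config

router_vars = {"hostname": "pe1", "loopback": "192.0.2.1"}   # normally from a DB/inventory
full_config = template.render(**router_vars)

with Device(host="pe1.example.net", user="automation") as dev:
    cu = Config(dev)
    cu.lock()
    # overwrite=True is "load override": the rendered config completely
    # replaces the candidate, so removed elements simply disappear.
    cu.load(full_config, format="text", overwrite=True)
    cu.pdiff()                                    # review what would change
    cu.commit(comment="full config rebuild from templates", timeout=120)
    cu.unlock()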

If the above is not doable for the entire configuration, one can take one configuration hierarchy level at a time and automate it, after which no manual configurations will be accepted under that hierarchy. This is rather trivial especially for those configuration hierarchies that tend to be static most of the time.
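A sketch of the per-hierarchy variant (hypothetical stanza and host): a Junos "replace:" tag plus an ordinary load replace swaps out that one hierarchy wholesale while leaving the rest of the config alone, which is exactly the "no manual config accepted under this hierarchy" behaviour:

# Sketch: automation owns one hierarchy at a time. The "replace:" tag makes
# a default PyEZ load (i.e. "load replace") replace that entire hierarchy,
# so anything configured by hand under it is swept away on the next run.
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

SNMP_STANZA = """
replace: snmp {
    location "POP1 rack 12";
    community monitoring {
        authorization read-only;
    }
}
"""

with Device(host="pe1.example.net", user="automation") as dev:
    cu = Config(dev)
    cu.lock()
    cu.load(SNMP_STANZA, format="text")   # default load action honours replace: tags
    cu.pdiff()
    cu.commit(comment="snmp hierarchy managed by automation")
    cu.unlock()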

Another option is to apply the auto-generated configuration via apply-groups and apply all manual configurations explicitly, so that the automatic and manual configurations merge with each other. The positive side of this approach is that it makes it easy to develop the automation tools so that manual configs are not overridden by auto-generated config, but I personally find it somewhat inconvenient that one doesn't really see the effective running config when using apply-groups, unless one remembers to use display inheritance.
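For comparison, a sketch of the apply-groups approach (the group name, hostname and contents are invented): the generated config lives entirely inside one group, so a refresh touches only that group, and whatever was configured by hand merges with it through inheritance:

# Sketch: auto-generated config confined to a single group. The replace: tag
# lets the group be refreshed wholesale on each run; manual config outside
# the group is untouched and merges via apply-groups inheritance.
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

GENERATED_GROUP = """
groups {
    replace: AUTOMATION {
        system {
            services {
                netconf { ssh; }
            }
        }
        protocols {
            lldp { interface all; }
        }
    }
}
apply-groups AUTOMATION;
"""

with Device(host="pe1.example.net", user="automation") as dev:
    cu = Config(dev)
    cu.load(GENERATED_GROUP, format="text")   # replace: tag refreshes only the group
    cu.commit(comment="refresh AUTOMATION group")
    # Caveat from the paragraph above: on the CLI the merged result is only
    # visible with "show configuration | display inheritance".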

Any thoughts appreciated.

Antti
Jason Lixfeld
2018-08-17 11:45:12 UTC
I’ll admit that I haven’t done much automation yet, so take this with a grain of salt and provide clue where required...
Post by Antti Ristimäki
Hi colleagues,
This is something that I've been thinking quite a lot, so I would be delighted to hear some comments, experiences or recommendations.
One option is to generate the whole config automatically e.g. from a template or a database and just _not_accepting_ any manual configurations at all. Then when there are needs to do something custom not yet supported by the automation tools, instead of manually configuring it one would take some additional time and build the support into the automation tools. The cost for this might be that deploying something new/custom/tailor-made might take a bit more time compared to just manually configuring it, but in a long run the benefits are obvious. I'm personally preferring this approach.
Generating the _whole_ configuration automatically off-line from the scratch makes it also easy to remove elements from the configuration, as the auto-generated config can completely replace the existing running-config.
Maybe I'm missing an implied exception, but every once in a while one needs to make some sort of manual configuration change to resolve, in a time-sensitive way, some corner case that the provisioning system doesn't support because someone external to you (i.e. a customer or IXP participant) changed something. How is that handled in this use case?
Post by Antti Ristimäki
If the above mentioned is not doable for the entire configuration, one can take one configuration hierarchy level at a time and automate it, after which no manual configurations will be accepted under that hierarchy. This is rather trivial especially for those configuration hierarchies that tend to be static most of the time.
Another option is to apply the auto-generated configuration via apply-groups and apply all manual configurations explicitly so that the automatic and manual configurations merge with each other. The positive side of this approach is that it makes easy to develop the automation tools so that manual configs are not overridden by auto-generated config, but I personally see somewhat inconvenient that one really doesn't see the effective running-config when using apply-groups, unless one remembers to display inheritance.
Any thoughts appreciated.
I tend to agree with you in that apply-groups can make things really hard to follow and make the config explode in size when you try to display inheritance. So maybe it makes sense at this point to just ignore the CLI altogether? If you have a tool that is going to write your configs with apply-groups or whatever, it can probably display that configuration in whatever format you want, so the CLI becomes somewhat obsolete for config review.

And, if you have all that in place, I'm sure this display interface can, and likely would, be a single pane of glass for all of your configurations. It would display anything you throw at it in exactly the same format, no matter what sort of device it's for.

I can certainly see the value in that.
Job Snijders
2018-08-17 11:54:48 UTC
Post by Jason Lixfeld
Maybe I'm missing an implied exception, but every once in a while one
needs to make some sort of manual configuration change to resolve, in a
time-sensitive way, some corner case that the provisioning system doesn't
support because someone external to you (i.e. a customer or IXP
participant) changed something. How is that handled in this use case?
Yes, one should create a mechanism so that somewhere in the pipeline you
can override the configuration the system generated. We call these overrides
'hacks'. We track hacks in a version-controlled repository, and I'd like
to suggest they should be used sparingly.
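Something like the following is one way to picture that overlay step (the file layout and names are invented for illustration, not Job's actual pipeline): the generator output is built first, then a per-router hacks file from a git checkout is layered on top.

# Sketch of a "hacks" overlay in a config pipeline. Hacks are kept in the
# same format as the generated output (with set-format output, statements
# appended later win on conflicts), one optional file per router, in git.
from pathlib import Path

def build_final_config(hostname: str, generated: str,
                       hacks_dir: Path = Path("hacks")) -> str:
    """Return the generated config plus any tracked hacks for this router."""
    hack_file = hacks_dir / f"{hostname}.set"
    if hack_file.exists():
        return generated + "\n" + hack_file.read_text()
    return generated

# e.g. final = build_final_config("pe1", rendered_config)
# and the result is loaded/committed like any other generated config.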

Kind regards,

Job
Job Snijders
2018-08-17 11:51:23 UTC
Dear Antti,
Post by Antti Ristimäki
This is something that I've been thinking quite a lot, so I would be
delighted to hear some comments, experiences or recommendations.
So, now that more and more of us are automating their network, there
will be the question about how to manage the configurations, if they
are partially automated and partially manually maintained. This will
be the case especially while transitioning from a pure CLI jockey
network towards a more automated one. There are probably multiple
One option is to generate the whole config automatically e.g. from a
template or a database and just _not_accepting_ any manual
configurations at all. Then when there are needs to do something
custom not yet supported by the automation tools, instead of manually
configuring it one would take some additional time and build the
support into the automation tools. The cost for this might be that
deploying something new/custom/tailor-made might take a bit more time
compared to just manually configuring it, but in a long run the
benefits are obvious. I'm personally preferring this approach.
Generating the _whole_ configuration automatically off-line from the
scratch makes it also easy to remove elements from the configuration,
as the auto-generated config can completely replace the existing
running-config.
The above paragraph is in a nutshell how NTT's AS 2914 network operates.
I can strongly recommend the approach.

A presentation is available here:

https://ripe69.ripe.net/wp-content/uploads/presentations/29-201411-ripe.pdf
https://ripe69.ripe.net/archives/video/178/

Kind regards,

Job
Michael Still
2018-08-17 13:06:09 UTC
Side note on apply-groups and display inheritance. I've submitted a Juniper
ER for an enhancement to make '| display inheritance' a default CLI
behaviour (configurable via a 'set cli display-inheritance' option that
defaults to off). I've also asked for a login-class option to enable this
for specific user roles, such as front-line NOC users who may benefit from
having it on by default. This is ER-077163 if you want to poke your Juniper
SE about it.

The reason I've asked for this is specifically because I've seen NOC
personnel spend many cycles investigating an issue without realizing that a
particular hidden apply-group config was affecting their investigation.

I have a couple of other semi-related (automation / configuration
enhancement) ERs going, if folks are interested and would like to chat
about those directly.
Post by Nathan Ward
Post by Antti Ristimäki
Another option is to apply the auto-generated configuration via
apply-groups and apply all manual configurations explicitly so that the
automatic and manual configurations merge with each other. The positive
side of this approach is that it makes easy to develop the automation tools
so that manual configs are not overridden by auto-generated config, but I
personally see somewhat inconvenient that one really doesn't see the
effective running-config when using apply-groups, unless one remembers to
display inheritance.
We've implemented this at a network I support, and it seems to be going well.
We approach it slightly differently though, in a way which may help solve your
usability problem, in a bit of a roundabout way. In short, we build groups
into almost everything, so people are used to doing display inheritance if
they need to look deeper at things. It's not perfect, but it's the best way
I've found to manage large bits of JunOS config.
- Global* - common on every router they exist on, applied at top level only
- Local* - unique to this router, applied at any level
- * - common on every router they exist on, applied at any level
All our groups have apply-flags omit;
Local* groups are only used when something is re-used several times on the
one router - for example on our BNGs, a list of DHCP interfaces in each of
the routing-instances we might push a subscriber into.
- GlobalDualREMX sets up whatever our standard things are for an MX with
2 REs, applied at top level.
- "MPLS" is applied at `interfaces blah` and `protocols rsvp interface
blah`, etc., and includes our per-interface MPLS config.
- VRFCustomers includes our import/export policies for our Customers VRF
(applied inside a routing-instance), and the loopback filter config for the
Customers VRF loopback (applied inside an interface).
The only config that’s outside groups is config unique to that router -
so, IP addressing, routing-instance names and RDs, interfaces (though they
have apply-groups within them for many settings), hostname, etc.
1) Config is short because of apply-flags omit. Seeing things unique to
this router is easy. It’s easy to spot differences as apply-groups are
different - and that’s all you generally need to look for. I just looked,
our BNGs are all about 500 lines of config, and all have identical group
config on them. Most of the config is rsvp-te tunnels, and access network
interfaces.
2) When we want to look deeper, we know to do `| display inheritance |
except #` and it becomes muscle memory - this really is the bit that helps
your use case, haha.
3) We can copy our groups from a git repository, load replace (in our git
repo they all have replace tags) and commit. Keeping the common config
consistent is super easy. Automating this is one "leg" of automation and
solves almost all of our automation requirements.
4) We can do bespoke mucking about outside the groups, and it's obvious
what those things are, what needs to be tidied up into groups, and what is
junk temp config that needs to be thrown out.
I think where this could work for you is to have your automation apply
any router-specific config just like a human would - outside the groups,
but leveraging the groups as much as possible. If you want to keep your
manual/automated config separate, stick the automated config in a big
single group - that way, manual config will override it, and it'll be very
clear that it's there and where it's come from.
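For anyone wanting to try the "copy groups from git, load replace, commit" leg described in point 3, a rough PyEZ sketch (the repository layout, file and host names are assumptions, not Nathan's actual tooling):

# Sketch: push the git-managed groups - each file carries its own "replace:"
# tag, e.g. "replace: groups { GlobalDualREMX { ... } }" - and commit only
# when the device has drifted from the repo.
from pathlib import Path
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

GROUPS_REPO = Path("junos-groups")          # local checkout of the groups repo

def push_groups(hostname: str) -> None:
    stanzas = "\n".join(p.read_text() for p in sorted(GROUPS_REPO.glob("*.conf")))
    with Device(host=hostname, user="automation") as dev:
        cu = Config(dev)
        cu.lock()
        cu.load(stanzas, format="text")     # "load replace" honours the replace: tags
        if cu.diff():                       # anything drifted?
            cu.pdiff()
            cu.commit(comment="sync groups from git")
        cu.unlock()

# push_groups("bng1.example.net")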
--
Nathan Ward
--
[***@gmail.com ~]$ cat .signature
cat: .signature: No such file or directory
[***@gmail.com ~]$
Niall Donaghy
2018-08-17 14:15:30 UTC
Hi Antti, folks,

@Antti: Feel free to reach out directly if we can be of assistance. I understand you are in CSC behind FUNET, connected to GÉANT?

Here in GÉANT we have 31 x MX480/960 routers, all acting as PE devices (no P devices), spanning Europe.

We run a large set of protocols and services (i.e. some logical systems, many routing-instances, carrier-of-carriers, dual-stack, LDP, RSVP-TE, MSDP, PIM, etc.).
We shift over 1 Tbps, and though our number of 'customers' is few - maybe 5-10 homed per box - we're running up to 50,000 lines of config on some boxes.

So where do we stand on config automation?

Whilst we do use configuration templates, our customers' requirements necessitate some exceptions in places.
Given our central position in connecting EU research and education networks together, and to the world, we are running quite a mix of services - production, pilot, experimental - and manual configuration directly on the CLI is the only game in town; why automate disposable config?
** Not to be confused with pure lab work - we have several labs, too, and appropriate divisions between lab and production.

We are moving toward Ansible/git/NAPALM/Bash glue scripts for chunks of configuration which seldom change, e.g. chassis, routing-options, snmp, standard policies and filters, etc.
I.e. we're going to automate the low-hanging fruit first, and expand from there.

Re: manual overwrites - what I'm going to POC is using the Junos 'protect' feature to block CLI users from futzing with what lives in git: when the git repo is pushed to the routers, we'll unprotect and re-protect those stanzas. So, in an out-of-hours emergency, our NOC can still unprotect and overwrite anything they need to.
Alternatively, fixes they may wish to implement can be updated in git. The key thing at the outset is choice - you can do it the way you're used to while you learn the new procedures, and there is no negative impact.

To ease the migration, learning, training, we plan to start slow and have the git push triggered by hand, rather than, say, cron.
We will have quasi-realtime automated diff reports so deviations are spotted same-day and can be addressed.
The idea is anyone making a change updates git then does a push (which also verifies).
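A sketch of what that hand-triggered "push which also verifies" step could look like (the protect/unprotect handling is left out, and the per-router file layout is an assumption):

# Sketch: manually triggered push that doubles as a drift report. The stanzas
# rendered from git are loaded, the diff is shown, and nothing is committed
# unless commit-check passes and the operator says yes.
from pathlib import Path
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

def push_from_git(hostname: str, rendered_dir: Path = Path("rendered")) -> None:
    candidate = (rendered_dir / f"{hostname}.conf").read_text()
    with Device(host=hostname, user="automation") as dev:
        cu = Config(dev)
        cu.lock()
        cu.load(candidate, format="text")    # replace-tagged, git-managed stanzas only
        diff = cu.diff()
        if not diff:
            print(f"{hostname}: in sync with git")
        elif cu.commit_check() and input(f"{hostname} drift:\n{diff}\napply? [y/N] ") == "y":
            cu.commit(comment="git push of managed stanzas")
        else:
            cu.rollback()                    # discard the candidate changes
        cu.unlock()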

Until we have that, we continue with our partial automation:

I've authored numerous scripts - the most commonly used have a web frontend - which take user input, populate templates, and offer to push to the chosen router(s).
For instance, all public and private peerings are 100% automated and populated with data from peeringdb.com.
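For flavour, a stripped-down version of what "populated with data from peeringdb.com" can look like: pull a peer's netixlan records from the public PeeringDB API and render BGP neighbor set-commands from a template (the BGP group name and template are invented, and the exact API field names should be checked against the current PeeringDB schema):

# Sketch: build Junos set-commands for a public peering from PeeringDB data.
import requests
from jinja2 import Template

NEIGHBOR_TMPL = Template(
    'set protocols bgp group PUBLIC-PEERS neighbor {{ ip }} '
    'peer-as {{ asn }} description "{{ name }}"\n'
)

def peering_config(peer_asn: int, ix_id: int) -> str:
    resp = requests.get("https://www.peeringdb.com/api/netixlan",
                        params={"asn": peer_asn, "ix_id": ix_id}, timeout=10)
    resp.raise_for_status()
    lines = []
    for entry in resp.json()["data"]:
        for ip in (entry.get("ipaddr4"), entry.get("ipaddr6")):
            if ip:
                lines.append(NEIGHBOR_TMPL.render(ip=ip, asn=peer_asn,
                                                  name=entry.get("name", "")))
    return "".join(lines)

# print(peering_config(2914, 26))   # ASN and IX id are just examples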

NB: The above is our current position and plan of action. Please consider that our needs are different from those of most SP networks.
In a more commercial operation with large-scale, cookie-cutter customers (who don't get special treatment - just a service catalogue), database-is-master is the way to go.

I'll finish by saying that the FOSS tools out there do a fantastic job - pick the toolchain you like/can understand, and don't be afraid to use it!

Off-topic: almost all change management is performed by automation - config application and verification checks.
This means that during the PM window we need only concentrate on verification and 'what went wrong', should something happen.
I.e. we don't burn brainpower or time making the changes - all that is done weeks in advance, and peer-reviewed where appropriate.


Br,
Niall

Niall Donaghy
Senior Network Engineer
GÉANT
T: +44 (0)1223 371393
M: +44 (0) 7557770303
Skype: niall.donaghy-dante
PGP Key ID: 0x77680027
nic-hdl: NGD-RIPE

a***@netconsultings.com
2018-08-19 17:33:38 UTC
Post by Antti Ristimäki
Hi,
Thank you all for your comments both on and off the list. A lot of food for
thought. I can see that many of you have evidently been thinking about the
same dilemmas. Niall, yes it is Funet we are talking about, although not
directly connected to GÉANT but to NORDUnet instead. I might contact you
unicast later.
There have been some comments about using (or not using) apply-groups for
managing and organizing the configuration. As discussed earlier, they might
cause quite some confusion when used incorrectly.
Automation is a huge project and it would be a pity to tailor all your automation around a specific vendor niche.
The same applies to the service building-block templates - I think it's better to have these defined in OpenConfig format and pushed as such, or translated to a vendor-specific format for non-standardized models.
Post by Antti Ristimäki
One more comment. Depending on which tools are being used, it might be
rather easy to add new configuration to the network, but the system should
also support removing unnecessary elements from the configuration. This is
where replacing the entire configuration, or at least an entire configuration
hierarchy, helps, as the unused configuration won't be added to the config in
the first place and no separate cleaning task is needed.
That would be handled by the config reconciliation routine, where the operational data store is compared to the config/intent data store and the operator is presented with a choice to copy oper -> intent or vice versa.
Also, it's better if the tool applies just the delta, not the whole config.
And it's also nice to have the option to perform the operation atomically (i.e. successful on all nodes, otherwise rollback).
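A hedged sketch of both of those ideas on Junos boxes: "load update" lets each device compute the delta from a full intended config, and "commit confirmed" gives a rough two-phase commit across several nodes (host names and timings are illustrative):

# Sketch: push full intended configs but let each device work out the delta
# (update=True == "load update"), and only make the change permanent once
# every node has taken it; otherwise the unconfirmed commits auto-roll back.
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

def deploy(intended: dict, confirm_minutes: int = 5) -> None:
    """intended maps hostname -> full rendered configuration text."""
    sessions = {}
    try:
        for host, conf in intended.items():              # phase 1: provisional commits
            dev = Device(host=host, user="automation").open()
            cu = Config(dev)
            sessions[host] = (dev, cu)
            cu.load(conf, format="text", update=True)    # device computes the delta
            cu.commit(confirm=confirm_minutes, comment="candidate rollout")
        for host, (dev, cu) in sessions.items():         # phase 2: confirm everywhere
            cu.commit(comment="confirmed rollout")
        # If anything raises before phase 2 finishes, the nodes that were never
        # confirmed revert by themselves when the confirm timer expires.
    finally:
        for dev, _cu in sessions.values():
            dev.close()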

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::


Michael Lee
2018-08-18 01:29:44 UTC
We have daily work configuring basic load balancers, customer certs and opening firewalls - 5-15 tickets a day; without some sort of automation that would waste a lot of resources for senior people. So I used Ansible and some Python and shell scripts.

Also considering using YAML and Jinja2 as standard templates for Nexus switch base provisioning, to standardize TACACS, SNMP, features, NTP and some ACLs.
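Along those lines, the YAML-plus-Jinja2 part can be as small as this (file and variable names invented); the same pattern works whether the output is NX-OS or Junos syntax:

# Sketch: per-device YAML vars feeding a Jinja2 template for the "standard"
# base config (TACACS, SNMP, ACLs, ...). Output is plain text, ready for
# whatever push mechanism (Ansible, NETCONF, copy-paste) comes next.
import yaml
from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("templates"))

def render_base(host_vars_file: str, template: str = "nexus-base.j2") -> str:
    with open(host_vars_file) as fh:
        host_vars = yaml.safe_load(fh)
    return env.get_template(template).render(**host_vars)

# print(render_base("host_vars/leaf01.yml"))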

Just a thought

Sent from my iPhone
Thomas Bellman
2018-08-19 12:43:55 UTC
I would be interested in a way to build a command alias with
`| display inheritance | display commit-scripts | display omit | exclude #`
or something - `exclude #` isn’t the best either, as # is often in int
description etc.
Slightly aside: instead of using exclude to get rid of the inheritance
comments, why not use the existing variants of 'display inheritance'?
Add one of the keywords 'brief', 'terse' or 'no-comments' to get fewer
and fewer comments about where things were inherited from.
I.e., 'show configuration | display inheritance no-comments'.
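The same view is also reachable from tooling, for what it's worth - a small sketch (host name invented; the inherit attribute on <get-configuration> should correspond to the no-comments variant):

# Sketch: fetch the inheritance-expanded config over NETCONF, so scripts or a
# NOC dashboard see the effective config rather than the bare apply-groups.
from jnpr.junos import Device

with Device(host="pe1.example.net", user="automation") as dev:
    cfg = dev.rpc.get_config(options={"format": "text", "inherit": "inherit"})
    print(cfg.text)     # expanded config, without the "## inherited from" annotations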

/Bellman
Pshem Kowalczyk
2018-08-20 00:56:38 UTC
Hi,

We started on the automation journey some time ago. These are just some
generic subjects that you might have to think about:

1. What to automate
There are many different types of 'automation' out there. We currently
concentrate on 'orchestration' of product instances and on 'automation' of
various operational tasks. That means that the 'base' configuration of any
device is assumed, so for example all the policers and shapers we use for
our customers are already pre-defined and can be used.

2. Orchestration - model
The key for us is the Product-Service-Resource model. In that approach each
product has a defined list of parameters (and their values) and consists of
a number of services (in the networking world, for example, an L3VPN product
would have services such as access interface, QoS, routing instance, etc.).
Both product and services are 'abstract' and don't reflect any
device/model. Services use 'resources' as a way of implementing the
configuration on individual devices. This allows for reusability of code,
but also allows for products that live across multiple domains. For example
- one product can contain a number of routers, switches, firewalls and
applications. Some might be provisioned using SSH/CLI scraping, some using
APIs. Currently we only generate the instance configuration (for example an
L3VPN), and not the base configuration. Base configuration in our case is
'automation' - like adding a new PE to the network - and is subject to
different rules.
We use Ansible as the engine, with multiple modules on top (including our
own).
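A bare-bones sketch of that model as data classes (all class and field names are invented for illustration): the product and services stay vendor-neutral, and only the resources know how to emit config for a particular device.

# Sketch of the Product-Service-Resource split. Only Resource.render touches
# anything vendor- or device-specific.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Resource:                       # device-specific implementation detail
    device: str
    render: Callable[[dict], str]     # returns config text for one device

@dataclass
class Service:                        # abstract building block (access, QoS, VRF, ...)
    name: str
    params: dict
    resources: List[Resource] = field(default_factory=list)

@dataclass
class Product:                        # e.g. "L3VPN", defined purely by parameters
    name: str
    params: dict
    services: List[Service] = field(default_factory=list)

    def device_configs(self) -> Dict[str, str]:
        configs: Dict[str, str] = {}
        for svc in self.services:
            for res in svc.resources:
                configs[res.device] = configs.get(res.device, "") + res.render(svc.params)
        return configs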

3. Handling errors
Things go wrong even when you automate them. When a product instance uses
resources across 10 devices (and takes 30 minutes to fully roll out) there
must be a reliable roll-back process available. We don't rely on the
devices to do it (as the config could have been changed by something else
already) but instead we pre-generate 'reversal' config that we deploy if we
run into problems. In the case of upgrades that config reverts to the
previously known good state; for new installations it simply removes the
deployed config.
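A minimal illustration of the pre-generated reversal idea (the service shape and statement names are invented): render the forward and reverse snippets from the same instance data at generation time, so the rollback never depends on what the box looks like later.

# Sketch: every render produces an "apply" and a "reverse" snippet up front;
# on a failed rollout the stored reverse snippets are pushed instead of
# relying on the devices' own rollback history.
from jinja2 import Template

APPLY_TMPL = Template(
    "set routing-instances {{ vrf }} instance-type vrf\n"
    "set routing-instances {{ vrf }} interface {{ ifl }}\n"
)
REVERSE_TMPL = Template(
    "delete routing-instances {{ vrf }}\n"
)

def render_service(instance: dict):
    """Return (apply_config, reversal_config) for one service instance."""
    return APPLY_TMPL.render(**instance), REVERSE_TMPL.render(**instance)

apply_cfg, reverse_cfg = render_service({"vrf": "CUST-A", "ifl": "ge-0/0/1.100"})
# apply_cfg is deployed; reverse_cfg is stored with the change record and only
# used if a later step in the multi-device rollout fails.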

4. Making updates
When a customer wants to upgrade their product from 500Mb/s to 1Gb/s on the
access layer - how do you do it? In our case we hold 'instance data', which
is the set of input values of the product parameters; any change to that
set causes all the configs to be regenerated and reprovisioned (details are
down to individual devices - some actually roll out the configs even if they
are not different, some do not).

5. Logging/reporting
All automated operations must be logged, including the changes they make to
all systems. High-level reporting (on the number of failures/successes) across
devices/types etc. helps to pin down problems quickly.

6. Dealing with shared resources
Sometimes making changes means changing objects that might already be
configured - for example, creating a unit on an interface that requires a
particular encapsulation on the interface. The easiest way to deal with
this is to standardise all shared resources, but we found it's not always
possible.

7. Good inventory system
You need a way of storing all the information about your network and
systems, and also the ability to automatically allocate things like VLANs, IPs, etc.
All of that must be available over an API.
We also store what we call 'instance data' - all the parameters that are
used to create the instance of the product on all devices.

8. Change process that allows for 'automatic deployments'
If you currently have a process that relies on peer reviews, CAB meetings
etc - those things will have to change. Our goal is to be able to provision
an instance using a single API call (but we're not there yet).

9. Offline generation and validation
We generate our configs offline, verify variables and syntax (where
possible) before deployment. This way a lot of errors and inconsistencies
can be detected even before touching the network/systems. Failing here is
'cheap' - nothing is really changed yet. If the failure happens during
deployment it is more 'expensive' - it has to be rolled back carefully on a
number of devices. Each service and resource is responsible for its own
validation. Some of them query external data sources, some query live
devices (for example to make sure that a VLAN ID is not already in use),
and some only do syntactic and semantic validation.
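One cheap way to do both halves of that on Junos gear, sketched with invented names: Jinja2's StrictUndefined catches missing variables at render time, and commit-check asks the router to validate the candidate without activating anything.

# Sketch: fail "cheap" before touching the network for real.
from jinja2 import Environment, FileSystemLoader, StrictUndefined
from jnpr.junos import Device
from jnpr.junos.utils.config import Config

env = Environment(loader=FileSystemLoader("templates"), undefined=StrictUndefined)

def validate(hostname: str, template: str, variables: dict) -> bool:
    rendered = env.get_template(template).render(**variables)   # raises on missing vars
    with Device(host=hostname, user="automation") as dev:
        cu = Config(dev)
        cu.lock()
        try:
            cu.load(rendered, format="text")
            return cu.commit_check()       # syntax + semantic checks, nothing committed
        finally:
            cu.rollback()                  # always discard the candidate
            cu.unlock()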

10. Post-deployment verification
Once all the bits and pieces are in - how do you confirm that the setup is
actually working? For example, for an L3VPN that might mean prefixes visible
in routing tables on devices, ICMP ping working between different PEs, etc.
For things like BGP sessions with customers (and any other customer-dependent
services) it's worth marking them as 'soft' failures at this stage.
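A couple of such probes, sketched for an L3VPN with PyEZ (the VRF, prefix and peer values are examples; the BGP check is the kind of thing that would be a 'soft' failure):

# Sketch: post-deployment checks - is the prefix in the VRF table, is the
# customer BGP session established?
from jnpr.junos import Device

def verify_l3vpn(hostname: str, vrf: str, prefix: str, peer: str) -> dict:
    results = {}
    with Device(host=hostname, user="automation") as dev:
        routes = dev.rpc.get_route_information(table=f"{vrf}.inet.0",
                                               destination=prefix)
        results["prefix_present"] = routes.find(".//rt") is not None
        bgp = dev.rpc.get_bgp_neighbor_information(neighbor_address=peer)
        state = bgp.findtext(".//peer-state", default="unknown").strip()
        results["bgp_established"] = (state == "Established")
    return results

# verify_l3vpn("pe1.example.net", "CUST-A", "10.10.0.0/24", "192.0.2.10")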

11. RBAC
Who should have access to which products, and on which devices can they deploy?

kind regards
Pshem