Security Operations, Operations Procedures and Responsibilities
Cyber Security Operations Strategy & Design to give you a better security posture.
Our global security consultants have decades of experience advising private clients and corporations across industries that range from construction, manufacturing, and transportation to education, hospitality, and government. We can help you create a robust security environment with services that include current and emerging threat assessments, policy review and development, and master planning.
Operations security involves planning and sustaining the day-to-day “rubber meets the road” processes that are critical to maintaining the security of institutions’ information environments. The extent and complexity of security operations will vary between institutions based on institutional risk tolerances and resource levels. However, each of the control areas in this chapter must be addressed in some manner to help mitigate common ubiquitous risks. The most important aspect of operations security is that the operations themselves need to be repeatable, reliable, and consistently performed.
If you are just starting an information security program or looking to evaluate and improve operations security then the following approach can be very helpful:
- Review the following areas to assess the confidentiality, integrity, and availability of operations center controls:
  - Operational procedures and responsibilities
    - Review documentation and evaluate guidance regarding change management, capacity management, and separation of development, test, and production environments
  - Malware detection and prevention controls
    - Evaluate their level of effectiveness
  - Data center backup strategy
    - Evaluate whether backup procedures and methods (e.g., encryption) are effective for both on- and off-premises backup management
  - Audit trails and logging
    - Review whether they are implemented effectively so that user activities are recorded and security reviews can detect tampering and unauthorized access
  - Installation of software on operational systems
    - Ensure licensing requirements are met
- Implement a formal vulnerability management program to proactively test IT infrastructure for exploitable vulnerabilities, and ensure that there is an effective process in place to manage corrective actions in collaboration with stakeholders.
- Prepare in advance for IT controls audits to avoid service disruptions.
To be effective in reducing information security risk and ensuring correct computing, the security program needs to include operational procedures, controls, and well-defined responsibilities. These are complemented and often necessitated by formal policies, procedures, and controls which are necessary to protect exchange of data and information through any type of communication media or technology.
We will briefly examine seven key security control areas in this chapter:
- Operational Procedures and Responsibilities (important operational processes include: Change Management; Capacity Management; Separation of Development, Test, and Operations Environments)
- Protection from Malware
- Backup
- Logging and Monitoring
- Control of Operational Software
- Technical Vulnerability Management
- Information Systems Audit Considerations
Operational Procedures and Responsibilities
To ensure the effective operation and security of information processing facilities.
Documented Operating Procedures
Key Question: Do we have procedures that are readily available, periodically updated, and consistently executed?
Operating procedures must be documented and readily available to the teams for which they have relevance. These procedures should cover methods that reduce the likelihood of introducing or enhancing risks due to accidental or ill-advised changes. Before authoring documentation, it is important to identify up front who the intended audience is. For instance, documentation that is intended to have value for new hires (continuity) often requires a greater degree of detail than steps for staff who regularly perform operations tasks.
It is very important that operating procedures be treated as formal documents that are maintained and managed with version and approval processes and controls in place. As technology and our systems infrastructure change, it is certain that operational procedures will become out of date or inaccurate. By adopting formal documentation and review processes, we can help reduce the likelihood of outdated procedures that bring forth their own risks: loss of availability, failure of data integrity, and breaches of confidentiality.
What should we document?
As mentioned before, the decision on what areas deserve documentation must be informed by an understanding of organizational risks, including issues that have previously been observed. However, a good list of items to consider includes the following:
- Configuration and build procedures for servers, networking equipment, and desktops
- Automated and manual information processing
- System scheduling dependencies
- Change management processes
- Capacity management and planning processes
- Support and escalation procedures
- System restart and recovery
- Logging and monitoring procedures
“Not enough time.” Very often operations teams already have considerable responsibility and may indicate that there is simply not enough time for documenting processes. The allocation of time for documentation efforts is a management issue, and for this reason it is important that IT leadership understand the risks associated with outdated or informal operational procedures. In addition, defining a mandatory requirement that documentation efforts be completed before closing a project or significant change can help.
- Wiki Software + Process Documentation - Wiki software can sometimes help establish a system of documenting and centrally maintaining operating procedures. This software usually supports change tracking and can also help identify procedure documents that may not have been updated in some time.
- IT Operations Manuals - Many IT organizations have found benefit in collating operational procedures into IT Operations Manuals that are available to relevant staff. This approach also has significant benefits and tie-in when considering Disaster Recovery.
Change Management Procedures
Key Question: Do we have a formal method for classifying, evaluating, and approving changes?
Change management processes are essential for ensuring that risks associated with significant revisions to software, systems, and key processes are identified, assessed, and weighed in the context of an approval process. It is critical that information security considerations be included as part of a change review and approval process alongside other objectives such as support and service level management.
What changes should we evaluate, and how do we get started?
Change management is a broad subject (see resource below for additional reading); however, some important considerations from an information security perspective include:
- Helping to ensure that changes are identified and recorded.
- Assessing and reporting on information security risks relevant to proposed changes.
- Helping classify changes according to the overall significance of the change in terms of risk.
- Helping establish or evaluate planning, testing, and “back out” steps for significant changes.
- Helping ensure that change communications are handled in a structured manner (see RACI matrix below).
- Helping ensure that emergency change processes are well defined and communicated, and that a security evaluation of these changes is also performed post-change.
“Change Management Takes Too Much Time.” Change management processes are notoriously susceptible to becoming overly complex. Staff who conduct changes are more likely to attempt to bypass change management processes they feel are too burdensome by intentionally classifying their changes at low levels or even not reporting them. If you are starting a change management program, it is often helpful to first focus on modeling large-scale changes and then work to find the change level definitions that balance risk reduction with operational agility and efficiency.
Business Impact Analysis - Undertaking a Business Impact Analysis can often help strengthen change management operations by developing an understanding of both system and process level dependencies. This can help to evaluate and plan for less obvious issues that emerge due to changes that impact system interactions (e.g., cascade failures).
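One way to picture classifying changes by risk is a small scoring function. The sketch below is purely illustrative: the `ChangeRequest` fields, the 1 to 3 scales, the score thresholds, and the level names ("standard", "normal", "major") are assumptions chosen for the example, and each institution would define its own.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    # Hypothetical change record; scales are illustrative.
    summary: str
    impact: int            # 1 (minor) .. 3 (affects critical services)
    likelihood: int        # 1 (unlikely to go wrong) .. 3 (likely to go wrong)
    has_backout_plan: bool
    emergency: bool = False


def classify(change):
    """Return (review_level, needs_post_change_security_review)."""
    score = change.impact * change.likelihood
    if not change.has_backout_plan:
        score += 2  # missing back-out steps raise the risk of the change
    if score >= 6:
        level = "major"      # full review, including security evaluation
    elif score >= 3:
        level = "normal"     # peer review plus security sign-off
    else:
        level = "standard"   # low risk, recorded but not individually reviewed
    # Emergency changes go ahead quickly but are evaluated after the fact.
    return level, change.emergency


patching = ChangeRequest("Apply monthly patches to test servers", 1, 1, True)
firewall = ChangeRequest("Replace border firewall", 3, 2, True)
hotfix = ChangeRequest("Emergency fix for ERP outage", 3, 1, False, emergency=True)

for change in (patching, firewall, hotfix):
    print(change.summary, "->", classify(change))
```

Keeping the rules this small is deliberate: a scheme that staff can apply in a minute is less likely to be bypassed than one that requires a lengthy questionnaire.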
Capacity Management Procedures
Key Question: Do we monitor resource utilization and establish projections of capacity requirements to ensure that we maintain service performance levels?
Formal capacity management processes involve conducting system tuning, monitoring the use of present resources and, with the support of user planning input, projecting future requirements. Controls in place to detect and respond to capacity problems can help lead to a timely reaction. This is often especially important for communications networks and shared resource environments (virtual infrastructure) where sudden changes in utilization can result in poor performance and dissatisfied users.
To address this, regular monitoring processes should be employed to collect, measure, analyze, and predict capacity metrics including disk capacity, transmission throughput, and service/application utilization.
Also, periodic testing of capacity management plans and assumptions (whether tabletop exercises or direct simulations) can help proactively identify issues that may need to be addressed to preserve a high level of availability for critical services.
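As a rough illustration of projecting future requirements from monitoring data, the sketch below fits a straight line to past disk-usage samples and estimates how many days remain before a volume fills. The function name, the sample values, and the assumption of linear growth are all illustrative; real utilization is often bursty, so a projection like this is a starting point for discussion rather than a forecast.

```python
def days_until_full(samples, capacity):
    """Estimate days until usage reaches capacity using a least-squares line.

    samples: list of (day_number, used_units) measurements.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx                      # growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    last_day = max(x for x, _ in samples)
    full_day = (capacity - intercept) / slope
    return max(0.0, full_day - last_day)


# Weekly readings in GB for a hypothetical 1000 GB volume.
samples = [(0, 500), (7, 514), (14, 528), (21, 542)]
print(days_until_full(samples, capacity=1000))  # prints 229.0
```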
- Emergency Operations - Many campuses that have experienced a crisis have seen dramatic surges of requests for information from institutional websites. If there are units at your institution that plan and manage emergency operations, then partnering with them to evaluate the capacity management implications of varied emergency response scenarios can often be helpful.
- Cloud Service Models + Resource Elasticity - Enterprise cloud service models including PaaS, IaaS, and SaaS often offer attractive resource elasticity features (in some cases scaling automatically and rapidly in response to demand). When considering these benefits and risk reduction capabilities, it is also important to understand and review other security considerations relevant to cloud computing (see Cloud Computing Security Hot Topic).
Protection from Malware
To protect the confidentiality, integrity, and availability (CIA) of information technology resources and data.
Key Question: Do we have effective security controls to prevent, detect, and recover from malware threats?
Backup
To ensure the integrity and availability of information processed and stored within information processing facilities.
Key Question: Do we make copies of information, software, and system images regularly and in accord with policy requirements?
System backups are critical: the integrity and availability of important information and software should be maintained by making regular copies to other media. Risk assessments should be used to identify the most critical data. Develop well-defined backup procedures, and establish well-defined requirements for long-term storage, testing, and business continuity planning.
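Testing that a backup copy actually matches its source is one small, automatable part of such procedures. The sketch below compares SHA-256 digests of two files using Python's standard `hashlib`; the function names and the temporary demo files are assumptions for illustration, and a real backup process would also need to cover encryption, retention, and restore testing.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup):
    """Return True when the backup has the same SHA-256 digest as the source."""
    return sha256_of(source) == sha256_of(backup)


# Demo with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "records.db"
    backup = Path(tmp) / "records.db.bak"
    source.write_bytes(b"example records")
    shutil.copyfile(source, backup)
    print(verify_backup(source, backup))   # True: copy matches
    backup.write_bytes(b"corrupted")
    print(verify_backup(source, backup))   # False: copy differs
```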
Logging and Monitoring
Key Question: Do we have processes and methods to reliably record, store, monitor, and review system events?
Effective logging allows us to reach back in time to identify events, interactions, and changes that may be relevant to the security of information resources. A lack of logs often means that we lose the ability to investigate events (e.g., anomalies, unauthorized access attempts, excessive resource use) and perform root cause analysis. In the context of this control area, logs can be interpreted very broadly to include automated and handwritten logs of administrator and operator activities taken to ensure the integrity of operations in information processing facilities, such as data and network centers.
How do we protect the value of log information?
Effective logging strategies must also consider how log data can be protected against tampering, sabotage, or deletion that undermines the integrity of log information. This usually involves role-based access controls that partition the ability to read and modify log data based on business needs and position responsibilities. In addition, timestamp information is critical when performing correlation analysis between log sources. One essential control is ensuring that institutional systems all have their clocks synchronized to a common source (often achieved via an NTP server) so that timelining of events can be performed with high confidence.
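Access controls are the primary protection described above. As a complementary illustration, the sketch below shows one common tamper-evidence technique, a hash chain, in which each log entry's hash covers the previous entry's hash so that a later modification becomes detectable. This technique is not prescribed by the text; the function names and record layout are assumptions for the example. Timestamps are recorded in UTC, which assumes the host clock is already synchronized.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash, record):
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(chain, message, now=None):
    """Append a record whose hash covers the previous entry's hash."""
    now = now or datetime.now(timezone.utc)
    record = {"time": now.isoformat(), "message": message}
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "hash": _digest(prev_hash, record)})


def verify_chain(chain):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return index
        prev_hash = entry["hash"]
    return None


log = []
append_entry(log, "admin login from console")
append_entry(log, "firewall rule modified")
append_entry(log, "admin logout")
print(verify_chain(log))                      # None: chain is intact

log[1]["record"]["message"] = "nothing happened"
print(verify_chain(log))                      # 1: the altered entry is detected
```

A chain like this only detects edits made without recomputing the later hashes, so the most recent hash would need to be stored somewhere the log's administrators cannot modify.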
What should we log?
The question of what types of events to log must take into consideration a number of factors including relevant compliance obligations, institutional privacy policies, data storage costs, access control needs, and the ability to monitor and search large data sets in an appropriate time frame. When considering your overall logging strategy it can very often be helpful to “work backwards”. Rather than initially attempting to catalog all event types, it can be useful to frame investigatory questions beginning with those issues that occur on a regular basis or have the potential to be associated with significant risk events (e.g., abuse of or attacks on ERP systems). These questions can then lead to a focused review of the security event data that has the most relevance to these particular questions and issues. Ideally, event logs should include key information such as:
- User IDs
- System activities
- Dates, times, and details of key events
- Device identity or location
- Records of successful and rejected system access attempts
- Records of successful and rejected resource access attempts
- Changes to system configurations
- Use of privileges
- Use of system utilities and applications
- Files accessed and the kind of access
- Network addresses and protocols
- Alarms raised by the access control system
- Activation and deactivation of protection systems, such as antivirus (AV) and intrusion detection systems (IDS)
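To make the list above concrete, the sketch below builds one log event as a JSON line carrying a handful of those fields. The field names, the `make_event` helper, and the sample values are assumptions for illustration; they do not follow any particular logging standard, and a real schema would be shaped by the investigatory questions your institution needs to answer.

```python
import json
from datetime import datetime, timezone


def make_event(user_id, action, outcome, source_addr, resource=None, when=None):
    """Build one audit event as a JSON line with a UTC timestamp.

    `when` should be a timezone-aware datetime; it defaults to the current time.
    """
    if outcome not in ("success", "rejected"):
        raise ValueError("outcome must be 'success' or 'rejected'")
    when = when or datetime.now(timezone.utc)
    event = {
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "user_id": user_id,          # who performed the activity
        "action": action,            # e.g. login, config_change, file_read
        "outcome": outcome,          # successful or rejected attempt
        "source_addr": source_addr,  # network address of the requester
        "resource": resource,        # file or system accessed, if any
    }
    return json.dumps(event, sort_keys=True)


print(make_event("jdoe", "login", "rejected", "192.0.2.10"))
print(make_event("admin1", "config_change", "success", "192.0.2.20",
                 resource="core-switch-01"))
```

Writing every event with the same keys and a UTC timestamp makes it easier to correlate entries across log sources later.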
Control of Operational Software
Make sure to establish and maintain documented procedures to manage the installation of software on operational systems. Operational system software installations should only be performed by qualified, trained administrators. Updates to operational system software should utilize only approved and tested executable code. It is ideal to utilize a configuration control system and have a rollback strategy prior to any updates. Audit logs of updates and previous versions of updated software should be maintained. Third parties that require access to perform software updates should be monitored and access removed once updates are installed and tested.
Technical Vulnerability Management
Technical vulnerabilities can introduce significant risks to higher-education institutions that can directly lead to costly data leaks or data breach events. Even though this fact is widely acknowledged, developing frameworks for detecting, evaluating, and rapidly addressing vulnerabilities is often a significant challenge. To help us approach this section it is useful to look at five critical success factors (below) that drive effective threat and vulnerability management approaches.
- Knowing What We Have (Asset Inventory): It is imperative to have an up-to-date inventory of your asset groups to allow for action to be taken once a technical vulnerability is reviewed and a mitigation strategy agreed on. These inventories also give us the ability to identify and prioritize “high risk systems” where the impact of technical vulnerabilities can be greatest.
- Establishing Clear Authority to Review Vulnerabilities: Because probing a network for vulnerabilities can disrupt systems and expose private data, higher education institutions need a policy in place and buy-in from the top before performing vulnerability assessments. Many organizations address this issue in their acceptable use policies, making consent to vulnerability scanning a condition of connecting to the network. Additionally, it is important to clarify that the main purpose of seeking vulnerabilities is to defend against outside attackers. (A public health metaphor may help people understand the need for scanning: we are looking for symptoms of illness.) There is also a need for policies and ethical guidelines for those who have access to data from vulnerability scans. These individuals need to understand the appropriate action when illegal materials are found on their systems during a vulnerability scan. The appropriate action will vary between institutions (for example, public regulations in Georgia versus public regulations in California). Some organizations may want to write specifics into policy, whereas others leave policy more open to interpretation and address specific issues through procedures such as consulting legal counsel.
- Vulnerability Awareness and Context: It is important that we keep up-to-date with industry notices about technical vulnerabilities and evaluate risk and mitigation strategies. Vulnerability notices are released on a daily basis and a plan needs to be in place for how to track, analyze, and prioritize our efforts.
- Risk and Process Integration: Technical vulnerability review is an operational aspect of an overall information security risk management strategy. As such, vulnerabilities must be analyzed in the context of risks, including those related to the potential for operational disruption. These risks must also have a clear reporting path that allows for appropriate awareness of risk factors and exposure. Lastly, vulnerability management should also be integrated into change management and incident management processes to inform the review and execution of these areas.
- System and Application Lifecycle Integration: The review of vulnerabilities also must be integrated in system release and software development planning to ensure that potential weaknesses are identified early to both lower risks and manage costs of finding these issues prior to identified release dates. (Three approaches to managing technical vulnerabilities in application software are described in the Application Security and Software Development Life Cycle presentation from the 2010 Security Professionals Conference.)
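Because vulnerabilities must be analyzed in the context of risk, one simple way to rank scan findings is to score each by likelihood of exploitation and impact, then work from the top of the list. The sketch below is a minimal illustration under stated assumptions: the `Finding` fields, the 1 to 5 scales, the multiplication rule, and the sample findings are invented for the example and are not taken from any scanning tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical scan finding; scales are illustrative.
    host: str
    title: str
    likelihood: int          # 1 (hard to exploit) .. 5 (easily exploited)
    impact: int              # 1 (low-value system) .. 5 (high risk system)
    validated: bool = False  # True once a follow-up check confirms the finding

    @property
    def risk(self):
        return self.likelihood * self.impact


def prioritize(findings, limit=None):
    """Order findings by risk score, highest first; ties favor validated ones."""
    ranked = sorted(findings, key=lambda f: (f.risk, f.validated), reverse=True)
    return ranked if limit is None else ranked[:limit]


findings = [
    Finding("print-server", "Outdated TLS configuration", 2, 1),
    Finding("erp-db", "Unpatched remote code execution", 4, 5, validated=True),
    Finding("lab-pc-17", "Default admin password", 5, 2),
    Finding("web-01", "SQL injection in search form", 5, 2, validated=True),
]

for f in prioritize(findings):
    print(f"{f.risk:>2}  {f.host:<12} {f.title}")
```

The impact value would normally come from the asset inventory, so that the same vulnerability ranks higher on a high risk system than on a low-value one.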
Depending on the size and structure of the institution, the approach to vulnerability scanning might differ. Small institutions that have a good understanding of IT resources throughout the enterprise might centralize vulnerability scanning. Larger institutions are more likely to have some degree of decentralization, so vulnerability scanning might be the responsibility of individual units. Some institutions might have a blend of both centralized and decentralized vulnerability assessment. Regardless, before starting a vulnerability scanning program, it is important to have authority to conduct the scans and to understand the targets that will be scanned.
Vulnerability scanning tools and methods are often somewhat tailored to varied types of information resources and vulnerability classes. The table below shows several important vulnerability classes and some relevant tools.
| Common Types of Technical Vulnerabilities | Relevant Assessment Tools |
| --- | --- |
| Application Vulnerabilities | Web Application Scanners (static and dynamic), Web Application Firewalls |
| Network Layer Vulnerabilities | Network Vulnerability Scanners, Port Scanners, Traffic Profilers |
| Host/System Layer Vulnerabilities | Authenticated Vulnerability Scans, Asset and Patch Management Tools, Host Assessment and Scoring Tools |

"Scanning Can Cause Disruptions." IT operations teams are quite reasonably sensitive about how vulnerability scans are conducted and keen to understand any potential for operational disruptions. Legacy systems and older equipment can often have issues even with simple network port scans. To help with this issue, it can be useful to build confidence in the scanning process by partnering with these teams to conduct risk evaluations before initiating or expanding a scanning program. It is also important to discuss the “scan windows” when these vulnerability assessments will occur to ensure that they do not conflict with regular maintenance schedules.
"Drowning In Vulnerability Data and False Positives." Technical vulnerability management practices can produce very large data sets. It is important to realize that when a tool indicates that a vulnerability is present, follow-up evaluation is frequently needed to validate the finding. Reviewing all of these vulnerabilities is infeasible for many teams. For this reason, it is very important to develop a vulnerability prioritization plan before initiating a large number of scans. These priority plans should be risk driven to ensure that teams are spending their time dealing with the most important vulnerabilities in terms of both likelihood of exploitation and impact.
Information Systems Audit Considerations
It is important to ensure that all IT controls and information security audits are planned events, rather than reactive 'on-the-spot' challenges. Most organizations undergo a series of audits each year, ranging from financial IT controls reviews to targeted assessments of critical systems. Audits that include testing activities can prove disruptive to campus users if any unforeseen outages occur as a result of testing or assessments.
Through working with campus leadership, it should be possible to determine when audits will occur and obtain relevant information in advance about the specific IT controls that will be examined or tested.
Develop an 'audit plan' for each audit that provides information relevant to each system and area to be assessed. These audit plans should take into consideration:
- Asset inventory with contact information for system administrators/owners
- Requirements for testing/maintenance windows
- Information about backups (if applicable) in case systems later need to be restored due to unplanned outages
- Checklists or other materials provided in advance by auditors
If applicable, work with IT and campus departments to provide audit preparation services to ensure that everyone understands their roles in the audit and how to respond to auditors' questions, issues and concerns. Protecting sensitive information during audits is critical, and documents provided to auditors should be recovered if possible, shortly before audits are completed.
All audit activity that assesses an operational system should be managed to minimize any impact on the system during required hours of operation. Any testing that could adversely affect an operational system should be conducted during off hours.