Original Article

Determination of Interobserver Correlation in the Evaluation of Liver Histopathology of Chronic Hepatitis B Patients and the Reflections on Treatment


  • Yasemin DURDU
  • Zehra Sibel KAHRAMAN
  • Ganime ÇOBAN
  • Merve CİN

Received Date: 29.03.2021 Accepted Date: 10.05.2021 Bezmialem Science 2022;10(3):299-304


Histopathological examination of the liver is the gold standard in the follow-up and treatment of chronic hepatitis B virus (HBV) disease. Ishak’s Modified Histological Activity Index (HAI) and fibrosis staging system are usually used in Turkey. Although a common scoring system is used, different pathologists can interpret the same sample differently because of various variables. In this study, we investigated how pathologists in different hospitals evaluated the liver histopathology of patients with chronic HBV, how well their results correlated with each other, and how this affected the treatment decision.


Pathology slides of liver biopsy materials of 10 patients were evaluated by pathologists in 5 different tertiary care hospitals. The coefficient of agreement between pathologists was determined using non-parametric statistical methods, and descriptive statistics were used to determine the percentage of patients who could receive treatment.


Agreement between pathologists was highest for the total HAI and fibrosis scores (k=0.8186, k=0.8217). The Kuder-Richardson reliability coefficient among centres was high for the treatment decision (k=0.8207). Although all patients had a treatment indication according to The European Association for the Study of the Liver 2017 guideline, an average of only 58% of the patients could receive treatment according to liver histopathology.


Differences in the pathological diagnosis between pathologists in centres may cause delays in chronic hepatitis B patients' access to treatment.

Keywords: Liver biopsy, liver histopathology, interobserver agreement, Modified Ishak scoring system


Hepatitis B virus (HBV) is a hepatotropic DNA virus that can cause acute and chronic hepatitis. Chronic disease caused by HBV may lead to fatal complications such as liver failure, hepatocellular cancer and liver cirrhosis (1). Liver biopsy has an important place in diagnosing these HBV-related complications and in deciding on treatment. Many scoring and staging systems have been established to increase agreement among pathologists and to standardize the histopathological examination of the liver. In Turkey, Ishak’s Modified Histological Activity Index (HAI) and Ishak’s fibrosis staging system (FSS) are generally used when deciding whether to start treatment in patients with chronic hepatitis B (2).

Histopathological examination of the liver is affected by many variables. Chief among them are features of the biopsy procedure, such as the size of the tissue examined, whether it is fragmented, and whether it was taken from beneath the capsule. In addition, errors in the preparation stages, the experience of the pathologist and the scoring system used can also affect the histopathological examination (3-8).

Modified Ishak scoring examines in detail the main lesions such as interface hepatitis, confluent necrosis, apoptosis and inflammation. It also evaluates fibrosis in detail by using 7 different stages (Table 1). This level of detail makes Ishak’s Modified HAI and fibrosis staging more discriminating and descriptive, but it also decreases their reproducibility (6).

The conditions and rules for the provision of health services by the state in Turkey are specified in the Health Implementation Communiqué (SUT). According to this communiqué, liver biopsy is mandatory before starting antiviral therapy in patients with chronic HBV, unless there are contraindications and apart from a few exceptional cases (9). Under the SUT, treatment can be started in patients with an HBV DNA level above 2,000 IU/mL whose liver biopsy shows HAI ≥6 or FSS ≥2 according to Ishak. Scoring systems are important for standardizing the evaluation of patients, but in daily practice pathologists can still report different results for the same sample. A one-point difference in scoring between pathologists can determine whether a patient receives treatment.
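The histological threshold described above can be written as a small check. The following Python sketch is illustrative only: the function and parameter names are hypothetical, the thresholds are the ones quoted in the text (HBV DNA above 2,000 IU/mL, and HAI ≥6 or FSS ≥2), and the exceptional cases mentioned in the communiqué are not modelled.

```python
def meets_sut_histology_criteria(hbv_dna_iu_ml: float,
                                 hai_total: int,
                                 fibrosis_stage: int) -> bool:
    """Return True if the quoted SUT thresholds are met (illustrative sketch).

    hbv_dna_iu_ml  -- HBV DNA level in IU/mL
    hai_total      -- total Modified HAI score according to Ishak
    fibrosis_stage -- Ishak fibrosis stage (FSS)
    """
    viral_load_ok = hbv_dna_iu_ml > 2000
    histology_ok = hai_total >= 6 or fibrosis_stage >= 2
    return viral_load_ok and histology_ok


# Same hypothetical patient, scored one HAI point apart by two pathologists.
print(meets_sut_histology_criteria(50_000, hai_total=5, fibrosis_stage=1))  # False
print(meets_sut_histology_criteria(50_000, hai_total=6, fibrosis_stage=1))  # True
```

The two calls differ by a single HAI point, which is enough to change the outcome of the check.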

Our aim in this study was to determine how consistently different pathologists in different hospitals evaluate liver biopsy samples from patients with HBV according to the Modified Ishak scoring system, and to examine how the differences are reflected in treatment.


Ethics Committee

The ethics committee approval of our study was obtained from the Clinical Research Ethics Committee of Bakırköy Dr Sadi Konuk Training and Research Hospital with the decision number 2019/94.


Ten patients who were admitted to the infectious diseases outpatient clinics in March 2019 and underwent liver biopsy were included in the study. The histopathological preparations of these 10 patients, who were planned to be treated for chronic HBV disease, were evaluated by pathologists in five different tertiary care hospitals. A total of 20 preparations stained with hematoxylin and eosin and Masson's trichrome were evaluated by five different pathologists according to Ishak’s Modified HAI and FSS. The results were recorded in Excel spreadsheets.

Inclusion/Exclusion Criteria

Patients older than 18 years of age who were followed up for chronic HBV and had phase 2 and phase 4 characteristics according to the EASL (The European Association for the Study of the Liver) 2017 guideline and had liver biopsy indication were included in the study (10). Patients with non-HBV liver disease were excluded from the study. Likewise, patients with contraindications for liver biopsy, pregnant women, and patients whose biopsy material contained less than 5 portal areas were excluded from the study.

Statistical Analysis

Agreement between pathologists (inter-observer agreement) was evaluated with Kendall’s W coefficient of agreement, a non-parametric statistical method. Separate coefficients were calculated for categories A, B, C and D of the Modified HAI grading system detailed in Table 1, as well as for fibrosis staging and the total HAI score. An HAI of 6 or above and/or a fibrosis stage of 2 or above was accepted as the criterion for treatment (according to SUT 2018). Because the treatment decision is binary (yes/no, coded 1/0), the Kuder-Richardson reliability coefficient (K-R 20) was calculated for it. Descriptive statistics were also used when necessary. The IBM SPSS 23 package program was used for statistical calculations.
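For readers unfamiliar with Kendall's W, a minimal Python sketch of the textbook formula with tie correction is given here. It is not the SPSS routine used in the study, and the function names and scores are hypothetical. With m raters and n subjects, each rater's scores are converted to ranks (ties receive the mean rank), S is the sum of squared deviations of the rank totals from their mean, and W = 12S / (m²(n³ − n) − mΣT), where T = Σ(t³ − t) over each rater's groups of tied scores.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for `ratings`: one list of n scores per rater."""
    m = len(ratings)
    n = len(ratings[0])
    rank_rows = [average_ranks(r) for r in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied scores.
    ties = sum(c ** 3 - c for r in ratings for c in Counter(r).values())
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives identical scores")
    return 12 * s / denom


# Hypothetical fibrosis stages: 5 pathologists (rows) scoring 6 biopsies.
example = [
    [1, 2, 3, 0, 4, 2],
    [1, 2, 3, 1, 5, 2],
    [2, 2, 3, 0, 4, 3],
    [1, 3, 3, 0, 4, 2],
    [1, 2, 4, 0, 5, 2],
]
print(round(kendalls_w(example), 4))
```

Because ordinal scores such as HAI categories take few distinct values, ties are frequent, which is why the tie-corrected form is sketched here rather than the simpler uncorrected one.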


Half of the 10 patients participating in the study were male and half were female, and their ages ranged between 22 and 61. The average age of women was 47 (35-61) and the average age of men was 36.6 (22-57). The data of the patients are given in Table 2.

The histopathological examination results of the centers are summarized in Table 3. Total HAI results are given first, followed by the categorical details and the FSS in the same table. Inter-observer agreement was high for category A and D scores in HAI grading (k=0.8186, k=0.8217). There was no agreement between observers for category B scores (Kendall’s W k<0.5), and for category C scores there was agreement, but it was not high. Agreement between observers was high for the total HAI score and the FSS. For the treatment decision, the K-R 20 coefficient was considered reliable because it was above 0.8.

The closer Kendall’s coefficient of agreement is to one, the more consistent the scores given by the pathologists are; the closer it is to zero, the more inconsistent they are, and a value of zero means there is no agreement. The Kuder-Richardson reliability coefficient (K-R 20) also takes a value between zero and one, and the closer it is to one, the higher the reliability. The calculated coefficients are given in Table 4.
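As an illustration of how K-R 20 behaves on yes/no treatment decisions, the sketch below implements the standard formula KR20 = (k / (k − 1)) · (1 − Σp_j q_j / σ²). It rests on assumptions not stated in the text: the k centers are treated as the "items", the patients as the subjects, and σ² is the population variance of each patient's total number of "treat" decisions. The function name and data are hypothetical.

```python
def kr20(decisions):
    """Kuder-Richardson 20 for a 0/1 matrix: one row per patient, one column per center."""
    n = len(decisions)       # number of patients (subjects)
    k = len(decisions[0])    # number of centers (items)
    totals = [sum(row) for row in decisions]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    if var_total == 0:
        raise ValueError("K-R 20 is undefined when all patients have the same total")
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in decisions) / n  # proportion of 'treat' at center j
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical decisions: 6 patients (rows) evaluated at 5 centers (columns).
example = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(round(kr20(example), 4))
```

When every center reaches the same decision for every patient, the coefficient equals one; disagreement between centers lowers it.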

While all of the patients in the study (100%) had a treatment indication according to the 2017 EASL guidelines, an average of 58±38% had a treatment indication under the histopathology criteria determined by the SUT. The percentage of patients eligible for treatment also varied with the center that evaluated the biopsy, averaging 58±15% across centers. The percentages of patients receiving treatment are shown in Table 5.
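One possible reading of the two figures above is that the first summarizes eligibility per patient (across centers) and the second per center (across patients). This is an interpretation, not something the text states explicitly. Under that assumption, the Python sketch below shows with hypothetical data why the two means coincide while the standard deviations differ. The use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev

# Hypothetical 0/1 eligibility matrix: rows are patients, columns are centers.
eligible = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]


def percent_by_patient(matrix):
    """Percentage of centers that found each patient eligible."""
    return [100 * sum(row) / len(row) for row in matrix]


def percent_by_center(matrix):
    """Percentage of patients that each center found eligible."""
    return [100 * sum(col) / len(col) for col in zip(*matrix)]


per_patient = percent_by_patient(eligible)
per_center = percent_by_center(eligible)
print(f"per patient: {mean(per_patient):.0f} ± {stdev(per_patient):.0f}%")
print(f"per center:  {mean(per_center):.0f} ± {stdev(per_center):.0f}%")
```

Both summaries average the same grid of decisions, so their means are equal; the spread is larger per patient because individual patients range from unanimously eligible to unanimously ineligible.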


Liver histopathology in patients with chronic HBV is still the gold standard for demonstrating liver status. Several scoring systems have been developed to create a standard approach in this regard (2). Although these scoring systems were created to ensure agreement between observers, differences may still occur because of the observers' subjective perspectives. These differences are smaller among observers within the same center and larger among observers in different centers (4). This was also noted in a study published in 2020 in the Journal of Hepatology, the official journal of EASL (11).

Today, the most commonly used liver histopathological scoring system in patients with chronic HBV is the Modified Ishak scoring system (8). In our study, the results for fibrosis, HAI category A (interface hepatitis) and HAI category D (portal inflammation) showed high interobserver agreement across different centers. Agreement for HAI category C (focal necrosis) was at an acceptable level, while HAI category B (confluent necrosis) showed no agreement. In a study conducted in our country, interobserver agreement was good for fibrosis, moderate for category D, and weak for categories A and C. In their study, Westin et al. (4) found low interobserver agreement for the category C assessment.

In the presence of cirrhosis, it is common to obtain fragmented tissue during liver biopsy. The presence of fragmentation in the tissue may also cause the fibrosis value to be scored lower (12). The liver lobe where the biopsy is performed may also cause differences in fibrosis scoring. However, even if the lobes from which the biopsy is taken are different, the fibrosis scoring may be consistent between the observers, while the HAI evaluation may be inconsistent between the observers (13).

The ultimate goal in many liver diseases is to prevent liver fibrosis, failure, cirrhosis and hepatocellular carcinoma (14). Studies emphasize that biopsy samples may not fully reflect the pathology in the liver, because a liver biopsy samples only about 1/50,000 of the liver and because chronic viral hepatitis is distributed heterogeneously within it (14). Nowadays, various serum biomarkers and radiological methods that measure the elasticity of the liver are often preferred over an invasive method such as biopsy for evaluating the status of the liver in chronic HBV infection (1,14).

To the best of our knowledge, there is no study in the literature on how interobserver agreement is reflected in treatment, which was the other aim of our study. While all of the patients included in the study had an indication for initiation of treatment according to the 2017 EASL guideline (10), only 58% met the indication for initiation of treatment according to the histopathological criteria determined by the SUT. These indication rates vary considerably between centers. This may delay the initiation of treatment and allow liver damage to progress.

Study Limitations

The fact that the length of the biopsy specimens and whether they were fragmented were not taken into account was a limitation of the study. The small number of samples was another limiting factor.


In our country, where histopathological evaluation is a criterion for starting treatment in patients with chronic HBV, disagreement between observers in different centers may cause differences in treatment initiation rates. This situation needs to be investigated in larger studies.


