import pandas as pd

df = pd.read_csv('housing.csv')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13580 entries, 0 to 13579
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Suburb         13580 non-null  object
 1   Address        13580 non-null  object
 2   Rooms          13580 non-null  int64
 3   Type           13580 non-null  object
 4   Price          13580 non-null  float64
 5   Method         13580 non-null  object
 6   SellerG        13580 non-null  object
 7   Date           13580 non-null  object
 8   Distance       13580 non-null  float64
 9   Postcode       13580 non-null  float64
 10  Bedroom2       13580 non-null  float64
 11  Bathroom       13580 non-null  float64
 12  Car            13518 non-null  float64
 13  Landsize       13580 non-null  float64
 14  BuildingArea   7130 non-null   float64
 15  YearBuilt      8205 non-null   float64
 16  CouncilArea    12211 non-null  object
 17  Lattitude      13580 non-null  float64
 18  Longtitude     13580 non-null  float64
 19  Regionname     13580 non-null  object
 20  Propertycount  13580 non-null  float64
dtypes: float64(12), int64(1), object(8)
memory usage: 2.2+ MB
df.isnull().any()
Suburb           False
Address          False
Rooms            False
Type             False
Price            False
Method           False
SellerG          False
Date             False
Distance         False
Postcode         False
Bedroom2         False
Bathroom         False
Car               True
Landsize         False
BuildingArea      True
YearBuilt         True
CouncilArea       True
Lattitude        False
Longtitude       False
Regionname       False
Propertycount    False
dtype: bool
df.columns
Index(['Suburb', 'Address', 'Rooms', 'Type', 'Price', 'Method', 'SellerG',
       'Date', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car',
       'Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude',
       'Longtitude', 'Regionname', 'Propertycount'],
      dtype='object')
# Select the column labels where at least one value is missing
has_null_cols = df.columns[df.isnull().any()].tolist()
has_null_cols
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']
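# Alternative: filter the boolean Series by itself and read the index
df.isnull().any()[df.isnull().any() == True].index.tolist()
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']
# Alternative: keep the columns whose missing-value count is positive
df.isnull().sum()[df.isnull().sum() > 0].index.tolist()
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']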
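
The three expressions above can be checked against each other without housing.csv. The following is a minimal sketch, assuming a small made-up frame (the name `toy` and its values are hypothetical and are not taken from the dataset). It computes the null mask once and reuses it, which avoids calling `isnull()` twice per expression.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for housing.csv; the values are made up.
toy = pd.DataFrame({
    "Rooms": [2, 3, 4],
    "Car": [1.0, np.nan, 2.0],
    "YearBuilt": [np.nan, np.nan, 1990.0],
    "Suburb": ["A", "B", "C"],
})

# One boolean per column: True if the column has at least one missing value.
mask = toy.isnull().any()

# Approach 1: index the column labels with the boolean mask.
via_columns = toy.columns[mask].tolist()

# Approach 2: filter the mask by itself and read the surviving index.
via_any = mask[mask].index.tolist()

# Approach 3: count missing values per column and keep the positive counts.
null_counts = toy.isnull().sum()
via_sum = null_counts[null_counts > 0].index.tolist()

print(via_columns)  # ['Car', 'YearBuilt']
print(via_columns == via_any == via_sum)  # True
```

The third approach is the only one that also keeps the per-column counts (`null_counts`), which may be useful when deciding whether to impute or drop a column.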